A troubleshooting case solved with strace and ldd

Problem: running a PHP script fails with this error:
# php /data/web/test.php
Warning: odbc_connect(): SQL error: [unixODBC][Driver Manager]Can't open lib '/opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so' : file not found, SQL state 01000 in SQLConnect in /data/web/test.php on line 2  

Yet the file /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so does exist on the system:

# ll /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so 
-rwxr-xr-x 1 root root 47822112 Apr 22  2014 /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so

Investigation:

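My first guess was a permissions problem. However, every path component from /opt down to libclouderaimpalaodbc64.so has mode 755, and the script was run as root, so permissions are not the issue. Next, trace the script with strace: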

strace -o err.log php /data/web/test.php  

The log contains the following:

open("/opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so", O_RDONLY) = 3  
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\1\36\0\0\0\0\0"..., 832) = 832  
fstat(3, {st_mode=S_IFREG|0755, st_size=47822112, ...}) = 0  
mmap(NULL, 35055264, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fc65c764000  
mprotect(0x7fc65e587000, 2093056, PROT_NONE) = 0  
mmap(0x7fc65e786000, 839680, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1e22000) = 0x7fc65e786000  
mmap(0x7fc65e853000, 521888, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fc65e853000  
close(3)                                = 0  
open("/etc/ld.so.cache", O_RDONLY)      = 3  
fstat(3, {st_mode=S_IFREG|0644, st_size=34693, ...}) = 0  
mmap(NULL, 34693, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fc66bfb7000  
close(3)                                = 0  
open("/lib64/tls/x86_64/libssl.so.1.0.0", O_RDONLY) = -1 ENOENT (No such file or directory)  
stat("/lib64/tls/x86_64", 0x7fff0d428510) = -1 ENOENT (No such file or directory)  
open("/lib64/tls/libssl.so.1.0.0", O_RDONLY) = -1 ENOENT (No such file or directory)  
stat("/lib64/tls", {st_mode=S_IFDIR|0555, st_size=4096, ...}) = 0  
open("/lib64/x86_64/libssl.so.1.0.0", O_RDONLY) = -1 ENOENT (No such file or directory)  
stat("/lib64/x86_64", 0x7fff0d428510)   = -1 ENOENT (No such file or directory)  
open("/lib64/libssl.so.1.0.0", O_RDONLY) = -1 ENOENT (No such file or directory)  
stat("/lib64", {st_mode=S_IFDIR|0555, st_size=12288, ...}) = 0  
open("/usr/lib64/tls/x86_64/libssl.so.1.0.0", O_RDONLY) = -1 ENOENT (No such file or directory)  
stat("/usr/lib64/tls/x86_64", 0x7fff0d428510) = -1 ENOENT (No such file or directory)  
open("/usr/lib64/tls/libssl.so.1.0.0", O_RDONLY) = -1 ENOENT (No such file or directory)  
stat("/usr/lib64/tls", {st_mode=S_IFDIR|0555, st_size=4096, ...}) = 0  
open("/usr/lib64/x86_64/libssl.so.1.0.0", O_RDONLY) = -1 ENOENT (No such file or directory)  
stat("/usr/lib64/x86_64", 0x7fff0d428510) = -1 ENOENT (No such file or directory)  
open("/usr/lib64/libssl.so.1.0.0", O_RDONLY) = -1 ENOENT (No such file or directory)  
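A real trace is usually far longer than this excerpt, so it can help to filter it. The following is a minimal sketch, assuming the log was written by strace without -f (so each line starts with the syscall name) and that awk is available. The function name unresolved_names is hypothetical. It prints file names that got ENOENT on open/openat at least once and were never opened successfully anywhere in the log.

```shell
# Print file names that failed with ENOENT and were never opened successfully.
# $1 is an strace log such as err.log.
unresolved_names() {
    awk -F'"' '
        /^(open|openat)\(/ {
            # $2 is the first quoted string on the line, i.e. the path.
            n = split($2, parts, "/")
            name = parts[n]
            if ($0 ~ /= -1 ENOENT/)     failed[name] = 1
            else if ($0 ~ /= [0-9]+ *$/) found[name] = 1
        }
        END {
            for (name in failed)
                if (!(name in found)) print name
        }
    ' "$1" | sort
}

# Example: unresolved_names err.log
```

Names that fail in one search directory but are found in a later one are filtered out, which leaves only the lookups that never succeeded.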

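The first two lines of the trace show that the system found the .so file and read 832 bytes from it (the start of the ELF file) without error, so the problem is not with this file itself. From the line open("/etc/ld.so.cache", O_RDONLY) = 3 onward, the dynamic loader starts resolving the libraries that this library itself depends on, and what follows is a long run of failed lookups for libssl.so.1.0.0. In other words, the "file not found" in the error message refers to a dependency rather than to the driver file. That makes the cause fairly clear; confirm it with ldd: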

# ldd /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so
        linux-vdso.so.1 =>  (0x00007fff384f1000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f9e30f8d000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f9e30d70000)
        libssl.so.1.0.0 => not found
        libcrypto.so.1.0.0 => not found
        librt.so.1 => /lib64/librt.so.1 (0x00007f9e30b67000)
        libsasl2.so.2 => /usr/lib64/libsasl2.so.2 (0x00007f9e3094c000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f9e30646000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f9e303c2000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f9e301ab000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f9e2fe17000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f9e3330a000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f9e2fbfd000)
        libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007f9e2f9c5000)
        libfreebl3.so => /lib64/libfreebl3.so (0x00007f9e2f7c2000)
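For a library with many dependencies, the unresolved entries can be pulled out of the ldd listing directly. This is a small sketch that assumes the "name => not found" line format shown above; the function name missing_deps is hypothetical.

```shell
# Read ldd output on stdin and print every dependency reported as "not found".
missing_deps() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}

# Example:
# ldd /opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so | missing_deps
```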

The library is missing two of its dependencies, libssl.so.1.0.0 and libcrypto.so.1.0.0. The system has OpenSSL 0.9.8e and 1.0.1e installed, but nothing under the 1.0.0 name:

# find /usr | grep libssl.so
/usr/lib64/libssl.so.0.9.8e
/usr/lib64/libssl.so
/usr/lib64/.libssl.so.0.9.8e.hmac
/usr/lib64/.libssl.so.1.0.1e.hmac
/usr/lib64/.libssl.so.6.hmac
/usr/lib64/.libssl.so.10.hmac
/usr/lib64/libssl.so.6
/usr/lib64/libssl.so.1.0.1e
/usr/lib64/libssl.so.10
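The same lookup can be done per missing name. The sketch below assumes GNU find and a hypothetical helper named installed_candidates: given a missing soname and one or more directories, it lists the installed files that share the library's base name (libssl.so.1.0.0 becomes the pattern libssl.so*). Hidden checksum files such as .libssl.so.1.0.1e.hmac do not match the pattern.

```shell
# List installed files sharing the base name of a missing soname.
# $1 is the soname; the remaining arguments are directories to search.
# With no directories given, find searches the current directory.
installed_candidates() {
    local soname=$1
    shift
    # Strip everything from the first ".so" onward, then add ".so" back.
    local base="${soname%%.so*}.so"
    find "$@" -maxdepth 1 -name "${base}*" 2>/dev/null | sort
}

# Example: installed_candidates libssl.so.1.0.0 /usr/lib64 /lib64
```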

Fix:

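Symlinking the 1.0.1e libraries under the 1.0.0 names was enough here (newer releases in the same series are generally backward compatible, though that is not guaranteed for every library):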

# ln -sv /usr/lib64/libssl.so.1.0.1e /usr/lib64/libssl.so.1.0.0 
# ln -sv /usr/lib64/libcrypto.so.1.0.1e /usr/lib64/libcrypto.so.1.0.0
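When creating such links by hand, a small guard can avoid pointing at a file that does not exist or clobbering a name that is already taken. This is only a sketch; link_soname is a hypothetical helper, not part of any tool.

```shell
# Create linkname -> target only if target exists and linkname is free.
link_soname() {
    local target=$1 linkname=$2
    if [ ! -e "$target" ]; then
        echo "target not found: $target" >&2
        return 1
    fi
    if [ -e "$linkname" ] || [ -L "$linkname" ]; then
        echo "refusing to replace existing $linkname" >&2
        return 1
    fi
    ln -sv "$target" "$linkname"
}

# Example: link_soname /usr/lib64/libssl.so.1.0.1e /usr/lib64/libssl.so.1.0.0
```

Afterwards, rerunning ldd on libclouderaimpalaodbc64.so should no longer show any "not found" entries.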