SystemTap Study Notes

Installation

CentOS
  • Prerequisites

    1. kernel-devel
    2. kernel-debuginfo
  • Install

yum install kernel-devel  
wget -c http://debuginfo.centos.org/6/x86_64/kernel-debuginfo-common-x86_64-$(uname -r).rpm  
wget -c http://debuginfo.centos.org/6/x86_64/kernel-debuginfo-$(uname -r).rpm  
rpm -ivh kernel-debuginfo*.rpm  
yum install systemtap  
Ubuntu

A helper script is needed to download and install the dbgsym (debug symbol) packages.

apt-get install linux-headers-$(uname -r)  
wget http://www.domaigne.com/download/tools/get-dbgsym  
bash get-dbgsym  
apt-get install systemtap  

Using SystemTap

  • Write a SystemTap script
//hello-world.stp
probe begin  
{
    print("hello world\n")
    exit()
}
  • Run it with stap
# stap -v hello-world.stp 
Pass 1: parsed user script and 103 library script(s) using 201384virt/29244res/3140shr/26608data kb, in 320usr/20sys/339real ms.  
Pass 2: analyzed script: 1 probe(s), 1 function(s), 0 embed(s), 0 global(s) using 201912virt/30092res/3428shr/27136data kb, in 10usr/0sys/9real ms.  
Pass 3: translated to C into "/tmp/stapwZDFEH/stap_a02887fb033e91cffb4ad1fbb8f58385_957_src.c" using 201912virt/30480res/3784shr/27136data kb, in 0usr/0sys/0real ms.  
Pass 4: compiled C into "stap_a02887fb033e91cffb4ad1fbb8f58385_957.ko" in 1850usr/570sys/2316real ms.  
Pass 5: starting run.  
hello world  
Pass 5: run completed in 0usr/30sys/329real ms.  

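SystemTap data flow

The flow is as follows: stap parses the script, generates the corresponding C code, compiles that code into a kernel module, and loads the module into the kernel. When a probed event occurs, the associated handler runs and the probe results are printed.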
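
As a rough illustration of the flow above, the following sketch marks the three moments a script can observe once its module is in the kernel: load, a recurring event, and unload. The file name lifecycle.stp and the variable ticks are made up for this note.

```systemtap
// lifecycle.stp (hypothetical file name)
global ticks

// fires once, right after the compiled module has been loaded
probe begin
{
    printf("module loaded, probes armed\n")
}

// fires every second; this stands in for "an event occurs"
probe timer.s(1)
{
    ticks++
    if (ticks >= 3)
        exit()
}

// fires once, just before the module is unloaded
probe end
{
    printf("saw %d timer events, module unloading\n", ticks)
}
```

Running it with stap -v prints the five passes, as in the hello-world transcript. With the standard stap options, stap -p4 lifecycle.stp should stop after the module is compiled (pass 4) without loading it, which makes the compile/load split visible.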


SystemTap Script Basics

Basic syntax

  • global VAR1, VAR2 declares global variables
  • probe PROBE { HANDLER } defines a probe and its handler
  • function FUNC(ARG1, ARG2, ...) { BODY } defines a global function
  • A simple example:
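// global variables
global start_ts, end_ts
// global function
function report()
{
    time_pass = end_ts - start_ts
    printf("time pass: %d seconds and %d milliseconds\n", time_pass / 1000, time_pass % 1000)
}
// probe: runs when the script starts
probe begin
{
    // global variables can be used inside a probe
    start_ts = gettimeofday_ms()
}
// probe: fires after 2 seconds and ends the session
probe timer.s(2)
{
    exit()
}
// probe: runs when the script ends
probe end
{
    end_ts = gettimeofday_ms()
    // global functions can be called inside a probe
    report()
}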
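
The example above only uses a function without arguments. As a minimal sketch of the FUNC(ARG1, ARG2, ...) form, the following function takes an argument and returns a value. The name ms_to_str is hypothetical, and the type annotations are optional because SystemTap can usually infer them.

```systemtap
// ms_to_str is a hypothetical helper, not part of any tapset
function ms_to_str:string(ms:long)
{
    // split a millisecond count into seconds and a 3-digit remainder
    return sprintf("%d.%03d s", ms / 1000, ms % 1000)
}

probe begin
{
    println(ms_to_str(2345))   // formats 2345 ms as "2.345 s"
    exit()
}
```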

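Data types and operators

  • long: long integer; supported operators:
/ * % + - >> << & ^ | && || = *= /= %= += -= >>= <<= &=
^= |= < > <= >= == !=
  • string: character string

    1. Supports concatenation with . and .=
    2. Supports comparison with < > <= >= == !=
  • associative arrays: must be global variables. They can be indexed by long, string, or a combination of the two, and their values can be of type long, string, or statistics. If you know awk, the usage is essentially the same as awk's associative arrays. In the example below, the reads array uses execname() and pid() together as the index, and each element holds a count. If execname() alone were the index, the several beam.smp processes in the output below would collapse into a single element; if pid() alone were the index, the output would be hard to read because the process name would be missing.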

global reads  
probe syscall.read  
{
    reads[execname(), pid()]++
}
probe timer.ms(5000), end  
{
    foreach([name, pid] in reads- limit 10)
        printf("%s[%ld] reads %d times\n", name, pid, reads[name, pid])
    delete reads
}
beam.smp[9576] reads 878 times  
beam.smp[19243] reads 127 times  
beam.smp[19823] reads 120 times  
beam.smp[1744] reads 100 times  
beam.smp[19080] reads 93 times  
beam.smp[24735] reads 90 times  
beam.smp[20158] reads 74 times  
beam.smp[28295] reads 66 times  
beam.smp[25073] reads 52 times  
stapio[13898] reads 26 times  
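
The types described so far can be tried without probing the kernel. The sketch below runs entirely in a begin probe; the file name and the variables n, shifted, s and counts are invented for illustration.

```systemtap
// types-demo.stp (hypothetical file name)
global counts   // associative arrays must be global

probe begin
{
    // long arithmetic and bit operations
    n = 7
    n += 3                      // n is now 10
    shifted = n << 2            // 10 << 2 is 40

    // string concatenation with . and .=
    s = "sys" . "tem"
    s .= "tap"                  // s is now "systemtap"

    // string comparison is lexicographic
    if (s == "systemtap" && "abc" < "abd")
        printf("n=%d shifted=%d s=%s\n", n, shifted, s)

    // array indexed by a (string, long) pair
    counts["demo", 1] = 5
    counts["demo", 2]++
    if (["demo", 1] in counts)
        printf("counts[demo,1]=%d\n", counts["demo", 1])

    // delete removes a single element when given an index
    delete counts["demo", 1]
    if (!(["demo", 1] in counts))
        printf("element removed\n")

    exit()
}
```

The [KEY1, KEY2] in ARRAY test checks membership without creating the element, which is handy before printing or deleting.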
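  • aggregates: statistics (global variables). Values of type long are added with the <<< operator. Aggregates are efficient, and a set of statistical functions can be applied to them (see the section on common built-in functions). The script below produces the same output as the array example above.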
global reads  
probe syscall.read  
{
        reads[execname(), pid()] <<< 1
}
probe timer.ms(5000), end  
{
        foreach([name, pid] in reads- limit 10)
            printf("%s[%ld] reads %d times\n", name, pid, @count(reads[name, pid]))
        delete reads
}
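
The script above only reads @count. As a small sketch of the other extractor functions, the following feeds a few made-up samples into an aggregate array and prints @count, @sum, @avg, @min and @max for each key. The keys and values are synthetic, so no kernel probes are needed.

```systemtap
// aggregate-demo.stp (hypothetical file name)
global sizes

probe begin
{
    // synthetic samples standing in for, say, read sizes per process
    sizes["procA"] <<< 10
    sizes["procA"] <<< 20
    sizes["procA"] <<< 60
    sizes["procB"] <<< 5

    // name+ sorts by key in ascending order
    foreach (name+ in sizes)
        printf("%s count=%d sum=%d avg=%d min=%d max=%d\n", name,
               @count(sizes[name]), @sum(sizes[name]), @avg(sizes[name]),
               @min(sizes[name]), @max(sizes[name]))

    exit()
}
```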

Script statements

  • The statement separator is the semicolon ;, which is optional
  • Use {} to group statements
  • Branching
if (COND) STMT [else STMT]  
  • Loops
while (COND) STMT  
for (INIT; COND; ITER) STMT  
foreach (VAR in ARRAY [limit NUM]) STMT  
foreach ([VAR1, VAR2] in ARRAY [limit NUM]) STMT  
break; continue;  
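  • A closer look at foreach: it iterates over an array and supports sorting and limiting the number of elements visited:

    1. Descending order by value: foreach(VAR in ARRAY-)
    2. Ascending order by value: foreach(VAR in ARRAY+)
    3. Only the first 10 elements: foreach(VAR in ARRAY- limit 10)

  • Others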

return [VAL];  
next;  
delete VAR;  // equivalent to VAR=0 or VAR=""  
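
The statements listed in this section can be combined in one short script. This is only a sketch with invented names (hits, i, k, n); it runs in a begin probe so it does not depend on kernel debuginfo.

```systemtap
// statements-demo.stp (hypothetical file name)
global hits

probe begin
{
    // for loop with continue and break
    for (i = 1; i < 10; i++) {
        if (i % 2)
            continue;           // skip odd numbers
        if (i > 6)
            break;              // stop once i exceeds 6
        hits[sprintf("key%d", i)] = i * i
    }
    // hits now holds key2=4, key4=16, key6=36

    // descending by value, only the two largest entries
    foreach (k in hits- limit 2)
        printf("%s=%d\n", k, hits[k])

    // while loop counting down to zero
    n = 3
    while (n > 0)
        n--
    printf("n=%d\n", n)

    delete hits                 // clears the whole array
    exit()
}
```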

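Script variables

  • Variables can be read from the command line

    1. $1 .. $N: the argument as a number
    2. @1 .. @N: the argument as a string
    3. $#: the number of arguments (as a number); @#: the number of arguments (as a string)

  • Example: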

global array  
probe syscall.*  
{
    if(pid()==$1)
        array[pp()]++
}
probe timer.ms(4000)  
{
    exit ()
}
probe end  
{
    foreach(i in array)
        printf("%s\t%d\n", i, array[i])
}
# stap strace-beam.stp 28295
# Output
kernel.function("sys_epoll_wait@fs/eventpoll.c:1710").call?     26827  
kernel.function("sys_futex@kernel/futex.c:2692").call?  519  
kernel.function("sys_sched_yield@kernel/sched.c:7350").call     9034  
kernel.function("sys_write@fs/read_write.c:407").call   9  
kernel.function("sys_read@fs/read_write.c:389").call    18  
kernel.function("sys_munmap@mm/mmap.c:2287").call       4  
kernel.function("sys_mmap@arch/x86/kernel/sys_x86_64.c:87")?    4  
kernel.function("sys_writev@fs/read_write.c:733").call  1
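
The example above only uses a numeric argument. As a hedged variation, the sketch below takes a process name as a string argument (@1) and a duration in seconds as a numeric argument ($2). The file name count-by-name.stp and the array calls are made up; name is the syscall name that the syscall tapset sets in each syscall.* probe.

```systemtap
// count-by-name.stp (hypothetical file name)
// usage: stap count-by-name.stp beam.smp 5
global calls

probe syscall.*
{
    // @1 is the first command-line argument, read as a string
    if (execname() == @1)
        calls[name]++
}

// $2 is the second command-line argument, read as a number
probe timer.s($2)
{
    exit()
}

probe end
{
    // ten most frequent system calls, busiest first
    foreach (n in calls- limit 10)
        printf("%-20s %d\n", n, calls[n])
}
```

Keying the array by name instead of pp() gives shorter output lines, at the cost of losing the source file and line shown in the pp() output above.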