WYY's Blog

JUST DO IT


Tracking Mobile Web Users Through Motion Sensors: Attacks and Defenses

Posted on 2017-03-27 | In Security & Privacy

These are my notes on the NDSS 2016 paper "Tracking Mobile Web Users Through Motion Sensors: Attacks and Defenses".
Link to the original paper

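1. Abstract

By detecting anomalies in sensor signals (caused by manufacturing imperfections, much like human fingerprints), a device (phone) can be uniquely identified with high accuracy. Web publishers and advertisers can exploit this to track users across web pages, ads, and visits.

Attack:

  • Use inaudible audio to stimulate the accelerometer and gyroscope so that the signals carry these anomalies. Because the attack relies on physical hardware, measures such as clearing cookies or private browsing cannot defend against it.

Defenses:

  • Calibrate the sensors to remove the anomalies.
    1. Calibrating the accelerometer is easy.
    2. Calibrating the gyroscope requires special equipment, and manual calibration is not effective.
  • Adding noise for obfuscation is more effective.
    1. Add plenty of noise that resembles the natural anomalies.
    2. At a larger scale: add temporal disturbances to obscure the frequencies of the dominant features.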

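2. Background on fingerprinting

  1. As early as 1960, the US government used unique transmission characteristics to track mobile transmitters. Later, spectral characteristics of radio signals were also used successfully to tell transmitters apart.
  2. Tiny manufacturing imperfections in network interface cards (NICs) can be exploited by analyzing the radio frequency of the emitted signal.
  3. A computer's unique and stable clock skew can be exploited; it is obtained from TCP and ICMP timestamps.
  4. Different devices have different installed software bases.

Browser fingerprinting

  1. Cookies (defeated by clearing cookies or private browsing mode)
  2. Enumerating the browser's fonts and other browser characteristics (rendering engine, performance benchmarks of different JS engines)
  3. UDID and IMEI, which require special permissions

Sensor fingerprinting

  1. Using the microphone and speaker, which requires elevated permissions
  2. Bojinov used the accelerometer (requires the user to calibrate the accelerometer)
  3. Dey used machine learning to analyze the accelerometer (stimulated by phone vibration)
  4. This paper uses both the accelerometer and the gyroscope.

Song proposed lowering the accelerometer's precision to defend against eavesdropping, for example reporting an acceleration of 1g so that small variations are hidden. For the gyroscope, however, this approach is not sufficient.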

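3. How motion sensors work

Even the slightest imperfection in the electro-mechanical structure produces idiosyncrasies that vary from chip to chip.

  • Accelerometer: measures proper acceleration (acceleration relative to free fall) rather than coordinate acceleration. The differential capacitance is proportional to the applied acceleration. Tiny differences in the gaps between the structural electrodes, introduced during manufacturing, affect the resulting capacitance. Slight imprecision in the electro-mechanical structure therefore leads to slight imprecision in the accelerometer chip.
  • Gyroscope: uses the Coriolis effect to measure angular velocity. When an angular velocity $\omega$ is applied to a moving mass $m$ with velocity $v$, the mass experiences a Coriolis force $F = -2m\,\omega \times v$ perpendicular to both the rotation axis and the direction of motion. Measuring the Coriolis force on a vibrating, known mass inside the gyroscope yields the angular velocity. The vibration changes a capacitance, which is converted into a voltage signal to measure the Coriolis force.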

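4. Features and classification algorithms

Data processing

Acceleration: let $a(t) = (a_x, a_y, a_z)$ and use the magnitude $|a(t)| = \sqrt{a_x^2 + a_y^2 + a_z^2}$.
This discards some information, but it makes the acceleration independent of the device's orientation. (Even for a stationary device, the three per-axis values differ a lot depending on how the device is placed, by up to ±1g.)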
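A minimal sketch of why the magnitude is orientation independent. The helper name `accelMagnitude` and the value of `G` are my own assumptions for illustration, not taken from the paper.

```javascript
// Hypothetical helper: collapse one 3-axis accelerometer sample into its magnitude.
function accelMagnitude(ax, ay, az) {
  return Math.hypot(ax, ay, az);
}

const G = 9.81; // assumed gravitational acceleration in m/s^2

// The same stationary phone in two poses: lying flat vs. standing upright.
// The per-axis readings differ completely, the magnitude does not.
const flat = accelMagnitude(0, 0, G);
const upright = accelMagnitude(0, -G, 0);
console.log(flat, upright);
```

In a browser the three inputs would typically come from a `devicemotion` event, but the computation itself does not depend on that.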

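Angular velocity: $\omega(t) = (\omega_x, \omega_y, \omega_z)$

Temporal and spectral features

10 temporal and 15 spectral features were chosen. These features have been well validated in earlier work.

| Category | Feature | Description |
| --- | --- | --- |
| Temporal | Mean | Mean signal strength across timestamps |
| Temporal | Standard Deviation | Standard deviation of signal strength |
| Temporal | Average Deviation | Average deviation from the mean |
| Temporal | Skewness | Measure of asymmetry about the mean |
| Temporal | Kurtosis | Measure of how flat or peaked the distribution is |
| Temporal | RMS | Square root of the arithmetic mean of the squared signal strength across timestamps |
| Temporal | Max | Maximum signal strength |
| Temporal | Min | Minimum signal strength |
| Temporal | ZCR | Rate at which the signal changes from positive to negative or back |
| Temporal | Non-Negative count | Number of non-negative values |
| Spectral | Spectral Centroid | Center of mass of the power spectrum distribution |
| Spectral | Spectral Spread | |
| Spectral | Spectral Skewness | |
| Spectral | Spectral Kurtosis | |
| Spectral | Spectral Entropy | |
| Spectral | Spectral Flatness | |
| Spectral | Spectral Brightness | |
| Spectral | Spectral Rolloff | |
| Spectral | Spectral Roughness | |
| Spectral | Spectral Irregularity | |
| Spectral | Spectral RMS | |
| Spectral | Low-Energy-Rate | |
| Spectral | Spectral flux | |
| Spectral | Spectral Attack Time | |
| Spectral | Spectral Attack Slope | Average slope of the spectral peaks |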
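To make a few of the temporal features concrete, here is a rough sketch. The function name `temporalFeatures` is hypothetical, and the exact definitions used by the paper's toolchain may differ (for example, I assume the population standard deviation and define ZCR as sign changes per adjacent pair).

```javascript
// Hypothetical helper: compute a handful of the temporal features for one stream.
function temporalFeatures(x) {
  const n = x.length;
  const mean = x.reduce((s, v) => s + v, 0) / n;
  // Population variance (an assumption; a sample variance would divide by n - 1).
  const variance = x.reduce((s, v) => s + (v - mean) ** 2, 0) / n;
  const rms = Math.sqrt(x.reduce((s, v) => s + v * v, 0) / n);
  // Count sign changes between adjacent samples, treating 0 as non-negative.
  let crossings = 0;
  for (let i = 1; i < n; i++) {
    if ((x[i - 1] >= 0) !== (x[i] >= 0)) crossings++;
  }
  return {
    mean,
    std: Math.sqrt(variance),
    rms,
    max: Math.max(...x),
    min: Math.min(...x),
    zcr: crossings / (n - 1),
    nonNegativeCount: x.filter((v) => v >= 0).length,
  };
}

const f = temporalFeatures([1, -1, 1, -1]);
console.log(f);
```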

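Classification algorithms and metrics

Supervised learning: support vector machine, naive Bayes, multiclass decision tree, k-nearest neighbors, quadratic discriminant analysis, and Bagged Decision Trees (Matlab's Treebagger model)

Evaluation metrics: Precision, Recall, and F-Score (the harmonic mean of the first two)

  • $Pr_i = TP_i / (TP_i + FP_i)$
  • $Re_i = TP_i / (TP_i + FN_i)$
  • $F_i = (2 \times Pr_i \times Re_i) / (Pr_i + Re_i)$

A conservative classifier achieves higher precision at the cost of lower recall, and vice versa. To capture the overall performance of the system, the paper computes averages over the $n$ classes:

  • $AvgPr = (\sum_{i=1}^{n} Pr_i) / n$
  • $AvgRe = (\sum_{i=1}^{n} Re_i) / n$
  • $AvgF = (2 \times AvgPr \times AvgRe) / (AvgPr + AvgRe)$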
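A small worked sketch of these metrics. The function name `averageScores` and the counts are made up for illustration, and I assume AvgF is the harmonic mean of AvgPr and AvgRe.

```javascript
// Hypothetical helper: perClass is an array of { tp, fp, fn } counts, one per device.
// Assumes every class has at least one prediction and one true sample,
// so no denominator is zero.
function averageScores(perClass) {
  const avg = (a) => a.reduce((s, v) => s + v, 0) / a.length;
  const pr = perClass.map((c) => c.tp / (c.tp + c.fp));
  const re = perClass.map((c) => c.tp / (c.tp + c.fn));
  const avgPr = avg(pr);
  const avgRe = avg(re);
  // Assumption: AvgF combines the averaged precision and recall.
  const avgF = (2 * avgPr * avgRe) / (avgPr + avgRe);
  return { avgPr, avgRe, avgF };
}

// Made-up counts for two devices.
const result = averageScores([
  { tp: 8, fp: 2, fn: 0 },
  { tp: 6, fp: 0, fn: 4 },
]);
console.log(result);
```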

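5. Fingerprinting evaluation

Experimental setup

  1. Data is collected with the JavaScript code in Appendix A. Because of limits imposed by the underlying operating system, the maximum sampling rate available through the browser is lower than what the hardware allows.
  2. Among several browsers, Google Chrome was chosen (sampling rate of 100 Hz, with access to both the accelerometer and the gyroscope).
  3. The test devices are iPhone 5, iPhone 5s, Nexus S, Galaxy S3, and Galaxy S2. The background audio is either silence, a 20 kHz ultrasonic tone, or a popular song.
  4. 10 samples are collected per setting, each with roughly 5-8 seconds of useful data.
  5. Unless stated otherwise, the phone rests on a stationary surface; the handheld case is considered later.

Feature exploration and selection

Simply using all features is not the best option, because they differ in accuracy and some even conflict with each other. Temporal features need no transformation of the data stream. For spectral features, the unevenly spaced data stream is first converted into an evenly spaced one using cubic spline interpolation, at a sampling rate of 8 kHz. Spectral features are extracted with MIRtoolbox and Libxtract. The best combination is found with the FEAST toolbox and the JMI criterion.
Of the 70 selected features, 21 come from the accelerometer and 49 from the gyroscope; 26 are temporal and 44 are spectral.

Results

  • Lab setting: the results are good. Audio stimulation has no obvious effect on accuracy here, but as discussed later, it helps considerably against sensor calibration or obfuscation.

  • Public setting: also reasonably good, with 90% accuracy.

Sensitivity analysis

  • Number of devices: accuracy drops as the number of devices grows, but it still stays above 90%.
  • Training set size: as the training-to-test ratio goes from 2:8 to 8:2, accuracy rises from 98% to 99%.
  • Temperature: temperature changes have an effect of about 10%.
  • Stability over time: there is an effect, but accuracy stays above 90%.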

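6. Countermeasures

Calibration

Bojinov's affine error model: $a^M = g \cdot a + o$, where $a^M$ is the measured value, $g$ the gain, $a$ the true value, and $o$ the offset.

Of all the features, the mean is the most discriminating feature of the sensor data stream, and it is closely tied to the offset.

Calibrating the accelerometer: measured value = offset + gain × true value

To compute the offset and gain of each axis, the axis has to be measured in both its positive and negative direction. For the z-axis: $S_z = (a^M_{z+} - a^M_{z-}) / 2g$, $O_z = (a^M_{z+} + a^M_{z-}) / 2$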
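The two-position calibration can be sketched as follows. The function name `calibrateAxis`, the readings, and the value of `G` are assumptions for illustration only.

```javascript
const G = 9.81; // assumed gravitational acceleration in m/s^2

// Hypothetical helper: measPlus is the reading with the axis pointing up (+g),
// measMinus the reading with the axis pointing down (-g).
function calibrateAxis(measPlus, measMinus, g = G) {
  const gain = (measPlus - measMinus) / (2 * g);
  const offset = (measPlus + measMinus) / 2;
  // Invert the affine error model: true = (measured - offset) / gain.
  const correct = (measured) => (measured - offset) / gain;
  return { gain, offset, correct };
}

// Made-up readings for the z-axis of a slightly miscalibrated sensor.
const cal = calibrateAxis(10.1, -9.7);
console.log(cal.gain, cal.offset, cal.correct(10.1), cal.correct(-9.7));
```

After correction, the two reference readings map back to +g and -g, which is exactly what the calibrated stream should report.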

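Calibrating the gyroscope: measured value = offset + gain × true value

Inducing a fixed angular change yields the gain; keeping the device stationary yields the offset. When the device is rotated by a fixed angle, the measured values tend to deviate from the true values, so the computed result is only an approximation of the actual value. This affects any system that measures angular displacement with a gyroscope. For example, for axis $i$: $O_i = (\theta^M_{i+} + \theta^M_{i-}) / (t_1 + t_2)$, $S_i = (\theta^M_{i+} - \theta^M_{i-} - O_i (t_1 - t_2)) / 2\pi$

Fingerprinting calibrated data:
Calibration is very effective for the accelerometer, while for the gyroscope the effect is barely visible. After calibration, audio stimulation finally gives a small boost in accuracy. Overall, provided the measurements are precise, calibration is a promising technique. To protect user privacy, manufacturers should therefore calibrate the sensors before shipping.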

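Data obfuscation

Compared with calibration, obfuscation adds extra noise, which hurts legitimate uses of the sensors.

  • Uniform noise: highest entropy given a bounded range
  • Laplace noise: highest entropy, inspired by differential privacy
  • White noise: affects all aspects of the signal

Uniform noise

I. Basic obfuscation

  • First: consider adding small-scale obfuscation within a range similar to the calibration errors observed earlier. Adding noise within this range amounts to switching to a different (incorrect) calibration, so the impact on users is minimal. The accelerometer offset range is [-0.5, 0.5], the gyroscope offset range is [-0.1, 0.1], and the gain range is [0.95, 1.05]. Values are drawn uniformly at random within these ranges. The effect is clear (accuracy drops by 7%-42% across scenarios). Audio stimulation also has a noticeable effect here, presumably because once the dominant features are obfuscated, the secondary features it strongly affects come into play.

  • Then: the study finds that the larger the dataset, the more pronounced the effect of obfuscation.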
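Basic obfuscation can be sketched roughly as below. The names `uniform` and `makeObfuscator` are hypothetical, and two points are my assumptions rather than details confirmed by these notes: the obfuscated value is computed as gain × measured + offset, and one offset/gain pair is drawn up front and reused for the whole stream.

```javascript
// Draw a value uniformly from [lo, hi). rng is injectable for reproducibility.
function uniform(lo, hi, rng = Math.random) {
  return lo + (hi - lo) * rng();
}

// Hypothetical helper: pick one random offset and gain, then apply them to
// every sample (assumption: obfuscated = gain * measured + offset).
function makeObfuscator(offsetRange, gainRange, rng = Math.random) {
  const offset = uniform(-offsetRange, offsetRange, rng);
  const gain = uniform(gainRange[0], gainRange[1], rng);
  return { offset, gain, apply: (measured) => gain * measured + offset };
}

// Ranges from the notes above: accelerometer offset in [-0.5, 0.5],
// gyroscope offset in [-0.1, 0.1], gain in [0.95, 1.05].
const accelObf = makeObfuscator(0.5, [0.95, 1.05]);
const gyroObf = makeObfuscator(0.1, [0.95, 1.05]);
console.log(accelObf.apply(9.81), gyroObf.apply(0.02));
```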

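II. Increasing the obfuscation range

Widening the obfuscation range does lower accuracy, but with diminishing returns. This shows that simply obfuscating the raw values is not enough to hide all the distinguishing features. So far only the signal values have been manipulated and no frequency characteristics have changed, so the classifier can still use the spectral features to tell devices apart.

III. Enhanced obfuscation

The main idea is to probabilistically insert a modified version of the current data point between the previous timestamp and the current timestamp, with the timestamp itself chosen at random. This affects the cubic spline interpolation of the data, and in turn the spectral features extracted from the stream.

IV. Impact

A pedometer app (which counts steps using the accelerometer) is used to analyze the impact. Calibration, basic obfuscation, and increased-range obfuscation have little effect on the pedometer, but enhanced obfuscation has a large effect, so a new obfuscation approach is needed.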

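Impact of Laplace noise on utility

Using an approach similar to differential privacy, the offset and gain are drawn at random from a Laplace distribution.
This strikes a fairly good balance between privacy and utility.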
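One common way to draw Laplace-distributed values is inverse transform sampling, sketched below. The function name `sampleLaplace` and the scale values are hypothetical; the notes do not say which parameters the paper uses.

```javascript
// Hypothetical helper: sample from a Laplace distribution with location mu
// and scale b via inverse transform sampling.
function sampleLaplace(mu, b, rng = Math.random) {
  let u;
  do {
    u = rng() - 0.5; // uniform in [-0.5, 0.5)
  } while (u === -0.5); // avoid log(0) at the boundary
  return mu - b * Math.sign(u) * Math.log(1 - 2 * Math.abs(u));
}

// Assumed parameters: offset centered at 0, gain centered at 1.
const offsetNoise = sampleLaplace(0, 0.1);
const gainNoise = sampleLaplace(1, 0.01);
console.log(offsetNoise, gainNoise);
```

Unlike the uniform case, the drawn values are unbounded, with most of the mass close to the center.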

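Impact of white noise on utility

Adding white noise has severe consequences, even at a high signal-to-noise ratio.

7. Limitations

  1. The number of test devices is not large, although there were tests in real-world settings.
  2. The gyroscope calibration was not done well, but it does show that manual calibration is error-prone.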

Tracking users with browser fingerprints

Posted on 2017-03-27 | In Security & Privacy

Single-browser

state of the art

Paper

Beauty and the Beast: Diverting modern web browsers to build unique browser fingerprints

Website

AmIUnique

Features

| Attribute | Source | Function or Example |
| --- | --- | --- |
| User agent | HTTP header | "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36" |
| Accept | HTTP header | "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8" |
| Content encoding | HTTP header | "gzip, deflate, sdch, br" |
| Content language | HTTP header | "en-US,en;q=0.8" |
| List of plugins | JavaScript | navigator.plugins |
| Platform | JavaScript | navigator.platform |
| Cookies enabled | JavaScript | navigator.cookieEnabled |
| Do not track | JavaScript | navigator.doNotTrack |
| Timezone | JavaScript | new Date().getTimezoneOffset() |
| Screen resolution and depth | JavaScript | screen.width/height/colorDepth |
| Use of local/session storage | JavaScript | localStorage/sessionStorage |
| Canvas | JavaScript | |
| WebGL Vendor | JavaScript | canvas.getContext("…"), see code 1 |
| WebGL Renderer | JavaScript | canvas.getContext("…"), see code 1 |
| Use of Adblock | JavaScript | Detect Adblock |
| List of fonts | Side-channel | List 1 in Cookieless Monster, see code 2 |
| List of fonts | Flash | flash.text.Font.enumerateFonts(true) |
| Screen resolution | Flash | flash.system.Capabilities.screenResolutionX/Y |
| Platform | Flash | flash.system.Capabilities.os |
| Language | Flash | flash.system.Capabilities.language |

code 1

// Query the unmasked WebGL vendor and renderer strings, if the browser exposes them.
var canvas = document.createElement("canvas");
var ctx = canvas.getContext("webgl") || canvas.getContext("experimental-webgl");
var webGLVendor = "Not supported";
var webGLRenderer = "Not supported";
// debugInfo is null when WebGL or the debug extension is unavailable.
var debugInfo = ctx && ctx.getExtension("WEBGL_debug_renderer_info");
if (debugInfo) {
    webGLVendor = ctx.getParameter(debugInfo.UNMASKED_VENDOR_WEBGL);
    webGLRenderer = ctx.getParameter(debugInfo.UNMASKED_RENDERER_WEBGL);
}

code 2

// Measure the rendered size of a test string in the given font.
// A size that differs from the fallback font's size suggests the font is installed.
function get_text_dimensions(font) {
    var h = document.getElementsByTagName("BODY")[0];
    var d = document.createElement("DIV");
    var s = document.createElement("SPAN");
    d.appendChild(s);
    d.style.fontFamily = font;
    s.style.fontFamily = font;
    s.style.fontSize = "72px";
    s.innerHTML = "font_detection";
    h.appendChild(d);
    var textWidth = s.offsetWidth;
    var textHeight = s.offsetHeight;
    h.removeChild(d);
    return [textWidth, textHeight];
}

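How to detect whether a specific Chrome extension is installed from a regular HTML page

Each extension submitted to the Chrome store has a unique ID.

// Probe a resource inside the extension. The load only succeeds if the
// extension is installed and exposes that file to web pages.
// The icon path depends on the extension being probed.
function detectExtension(extensionId, callback) {
    var img = new Image();
    img.onload = function() {
        callback(true);
    };
    img.onerror = function() {
        callback(false);
    };
    // Set src last so both handlers are attached before loading starts.
    img.src = "chrome-extension://" + extensionId + "/resources/icon_16.png";
}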

Cross-browser

Paper

(Cross-)Browser Fingerprinting via OS and Hardware Level Features

Website

UNIQUEMACHINE

Weakness

  • Small size of the training data
    Only 3,615 fingerprints from 1,903 users within three months.
  • WebGL tasks need a significant time overhead.

Contribution

  • AmIUnique considered WebGL "too brittle and unreliable" because it selected a random WebGL task and did not restrict many variables, such as texture, transparency, light, canvas size, and anti-aliasing.
  • Some differences between rendering results are very subtle, i.e., with one or two pixel variance.
  • WebGL rendering is a combination of software and hardware in which the hardware contributes more than the software. The uniqueness of software rendering is definitely much lower than that of hardware rendering, but still not zero.

Features

Screen resolution
  • problem: The resolution changes in Firefox and IE when the user zooms in or out of the web page.
  • method:
    • Detect the zoom level based on the size of a div tag and the device pixel ratio, and then adjust the screen resolution accordingly.
    • Use the ratio between screen width and height, which does not change with the zoom level.
  • addition:
    • availHeight, availWidth, availLeft, availTop, and screenOrientation.
    • Users may open different browsers on different screens that have different resolutions.
Number of CPU virtual cores
  • method: navigator.hardwareConcurrency
  • addition: Safari halves the number of cores available to Web Workers.
AudioContext

Peak values and their corresponding frequencies are relatively stable across browsers.

  • paper: Online Tracking: A 1-million-site Measurement and Analysis
  • problem: The entropy is much smaller than that of the entire wave.
List of fonts
  • problem: Not all fonts are cross-browser fingerprintable because some fonts are web specific and provided by browsers.
Line, curve and anti-aliasing

There are many existing algorithms for anti-aliasing, such as first-principles approach, signal processing approach, and mipmapping, which make anti-aliasing fingerprintable.

Vertex shader and fragment shader

The algorithm differs from one graphics card to another, which makes textures fingerprintable.

  • Varyings: Provide an interface between the vertex shader and the fragment shader. The interpolation algorithm varies across graphics cards.
  • Textures: Given a mapping between vertices and a texture, the fragment shader calculates the color of each pixel based on the texture.
Transparency via Alpha Channel

Because some graphics cards adopt discrete alpha values, some jumps may be observed in the changes of transparency effects.

Image encoding and decoding

Different algorithms may uncover different information during decompression. Both DataURL and JPEG formats are unstable across different browsers, because these formats are lossy and are implemented differently in different browsers and on the server side as well.

  • problem: a single-browser feature, and cannot be used for cross-browser
Installed writing scripts (languages)

A browser with a particular language installed will display the language correctly, and otherwise show several boxes.

WebGL tasks

The size of the canvas is 256×256. The axes of the canvas are defined as follows. [0, 0, 0] is the middle of the canvas, where x-axis is the horizontal line that increases to the right, y-axis is the vertical line that increases to the bottom, and z-axis increases when moving far from the screen. An ambient light with the power of [R: 0.3, G: 0.3, B: 0.3] on a scale of 1 is present, and a camera is placed at the location of [0, 0, -7].

  • Task (a): Texture
    A randomly generated texture, rather than a regular one, has more fingerprintable features.

  • Task (b): Varyings

  • Task (b’) Anti-aliasing + Varyings

  • Task (c) Camera (shrinks the cube, which reduces the differences)
    Camera moved to a new location of [-1, -4, 10]

  • Task (d) Lines and curves

  • Task (d’) Anti-aliasing + Lines and curves

  • Task (e) Multi-models
    Entropy is only 0.01 higher than Task (a).

  • Task (f) Light
    A diffuse, white point light. The power of the light is 2 for each primary color, and the light source is located at [3.0, -4.0, -2.0].
    The models are colored, so monochromatic light could reduce some subtle differences. Light that is too weak cannot illuminate the models, and light that is too strong turns everything white. The position is random.
    Entropy is only slightly higher than Task (a).

  • Task (g) Light and models
    Tests the interaction of a single, diffuse, point light and two models, because one model may create a shadow on another when illuminated by a point light.
    Entropy is only slightly higher than Task (f), by 0.03.

  • Task (h) Specular Light
    Tests the effects of a diffuse point light with another color and a specular point light on two models.
    Entropy is 0.9 higher than Task (e) (Task (f) is 0.01 higher than Task (e)).

  • Task (h’) Anti-aliasing + Specular Light

  • Task (h”) Anti-aliasing + Specular Light + Rotation
    Entropy decreases and stability increases; after rotating to another side, less information is visible.

  • Task (i) Two Textures (worse; the first texture layer was carefully chosen)
    Add another texture on the multi-models in Task (e).

  • Task (j) Alpha

    • Many GPUs do not accept smaller steps.
    • The Suzanne and sofa models are positioned so that they partially overlap.
      As alpha increases, entropy tends to increase, but with repeated drops, which are caused by software rendering.
  • Task (k) Complex lights
    Because there are more than 5,000 models and the light reflections affect each other, this task works extremely well.

  • Task (k’) Anti-aliasing + Complex lights

  • Task (l) Clipping plane
    Contributes little.

  • Task (m) Cubemap texture + Fresnel effect
    Fairly good; the cube map carries more information.

  • Task (n) DDS textures
    A Microsoft format; some browsers do not support it.

  • Task (o) PVR textures
    Only supported on Apple devices.

  • Task (p) Float textures
    Fairly good; depth carries more information.

  • Task (q) Video (Animating Textures)
    Works well for single-browser fingerprinting. Decoding video is a combination of the browser, the driver, and sometimes the hardware as well.

  • Task (r) Writing Scripts

To be continued

Android Emulator on Ubuntu

Posted on 2017-03-27 | In Security & Privacy

How to Resize Android 4.4 Emulator Internal Storage

Question on Stack Overflow

Problem:

When creating a new Android 4.4 Virtual Device using the AVD Manager, I cannot get the internal storage to be anything larger than 200MB.

Solution:

  • Now that the emulator file system is ext4, I was able to resize userdata.img using standard Linux tools.
  • Even this approach can cause the Android emulator to hang at the boot logo. The reason is that resize2fs makes changes that are valid in general, but Android treats the result as a broken file system and refuses to mount it in rw mode, which hangs the boot process.
  • Even e2fsck does not fix this for Android. As a workaround, I use tune2fs to change how Android should proceed when mounting a broken file system.
  1. start emulator
  2. cd ~/.android/avd/emulator_name
  3. rm userdata-qemu.*
  4. resize2fs userdata.img 1024M
  5. start then stop the emulator
  6. e2fsck -f userdata-qemu.img
  7. resize2fs userdata-qemu.img 1024M
  8. tune2fs -e continue userdata-qemu.img

github&hexo

Posted on 2017-03-27

Setting up a Github Pages blog with Hexo on Mac

Install

Install node.js

brew install nodejs

Installing Node.js and NPM is pretty straightforward using Homebrew. Homebrew handles downloading, unpacking and installing Node and NPM on your system. The whole process (after you have XCode and Homebrew installed) should only take you a few minutes.

  • To see if Node is installed, type node -v in iTerm. This should print the version number so you’ll see something like this v7.7.4.
  • To see if NPM is installed, type npm -v in iTerm. This should print the version number so you’ll see something like this 4.1.2.

Install Hexo

npm install -g hexo
  • To see if Hexo is installed, type hexo -v in iTerm.

Setup Hexo

Once you’ve got Hexo installed, you can simply run:

$ hexo init username.github.com
$ cd username.github.com
$ npm install hexo-deployer-git --save

Next, we need to configure a couple of things so open up the config file:

$ vim _config.yml

Some items you’ll want to immediately change are:

  • title
  • description
  • author
  • url

To make Github deployments work with hexo deploy you'll want to make sure that the # Deployment section looks something like:

# Deployment
## Docs: https://hexo.io/docs/deployment.html
deploy:
  type: git
  repo: https://github.com/username/username.github.io.git
  branch: master

Setup Github

Github provides hosting for static websites via Github Pages. As long as you have an index.html in the root of your repo, Github will serve up the static files for you.

Create a new repository for your blog

You can get this set up by creating the required repository for your new blog: simply create a new repo named in the format username.github.com, for example: wu-yuanyi.github.com.

Generate some content

Now that everything is assembled in place, it’s time to flesh out the blog. To generate a new post in Hexo simply issue:

$ hexo new {postname}

Now you can go ahead and add your content, and then when finished, simply run:

$ hexo clean
$ hexo generate
$ hexo deploy

And your new blog will be live at http://username.github.com


Yuanyi Wu

朝闻道 夕死可矣

© 2019 Yuanyi Wu