Archive for November, 2012

How to Write a Computer Science Paper with No Scruples

November 30th, 2012

This post is an introduction to the paper [1].

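Anyone who works on compiler performance optimization inevitably has to deal with benchmarks. You implement an idea, run it on a benchmark, and if performance improves by a few percent, that is a great relief and you can write a paper. Most of the time, though, a speedup is a matter of luck, and slowdowns are actually more common. When that happens, don't panic and don't despair: change the link order, add a few more environment variables, and witness the miracle. Does the performance comparison look different now? Try a few more times and the results can shift by as much as 5% to 10%, sometimes up and sometimes down. At this point, unscrupulous as you are, you can pick the link order and environment settings that favor you most and put them in your paper.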

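Just kidding. You are certainly not a person without scruples, but to be rigorous you should read the paper by Todd Mytkowicz et al. so that you do not fall into the "measurement bias" trap in your experiments. Computers have become extremely complex, and all sorts of seemingly trivial factors can affect your performance measurements. Only by fully considering these factors during your experiments and actively guarding against them can you make your results more credible.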

The abstract of the original paper:

This paper presents a surprising result: changing a seemingly innocuous aspect of an experimental setup can cause a systems researcher to draw wrong conclusions from an experiment. What appears to be an innocuous aspect in the experimental setup may in fact introduce a significant bias in an evaluation. This phenomenon is called measurement bias in the natural and social sciences.

Our results demonstrate that measurement bias is significant and commonplace in computer system evaluation. By significant we mean that measurement bias can lead to a performance analysis that either over-states an effect or even yields an incorrect conclusion. By commonplace we mean that measurement bias occurs in all architectures that we tried (Pentium 4, Core 2, and m5 O3CPU), both compilers that we tried (gcc and Intel’s C compiler), and most of the SPEC CPU2006 C programs. Thus, we cannot ignore measurement bias. Nevertheless, in a literature survey of 133 recent papers from ASPLOS, PACT, PLDI, and CGO, we determined that none of the papers with experimental results adequately consider measurement bias.

Inspired by similar problems and their solutions in other sciences, we describe and demonstrate two methods, one for detecting (causal analysis) and one for avoiding (setup randomization) measurement bias.
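To make the idea of setup randomization a little more concrete, here is a minimal sketch of one way to apply it to a single factor, the size of the UNIX environment. It is not the procedure from the paper. The variable name `BENCH_PADDING`, the helper names, and the normal-approximation confidence interval are all assumptions made for illustration, and link order is not randomized here because that requires rebuilding the program.

```python
import os
import random
import statistics
import subprocess
import time


def random_setup(rng, max_env_bytes=4096):
    """Copy the current environment and add a padding variable of random size.

    BENCH_PADDING is a hypothetical name; its only job is to change the
    total size of the environment seen by the benchmarked process.
    """
    env = dict(os.environ)
    env["BENCH_PADDING"] = "x" * rng.randrange(max_env_bytes)
    return env


def time_command(cmd, env):
    """Run cmd once under env and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def randomized_runs(cmd, trials=20, seed=0, timer=time_command):
    """Measure cmd under `trials` randomly chosen setups."""
    rng = random.Random(seed)
    return [timer(cmd, random_setup(rng)) for _ in range(trials)]


def summarize(samples):
    """Return (mean, low, high) with a rough 95% interval for the mean.

    Uses a normal approximation, which is an assumption of this sketch,
    and needs at least two samples.
    """
    mean = statistics.mean(samples)
    half_width = 1.96 * statistics.stdev(samples) / len(samples) ** 0.5
    return mean, mean - half_width, mean + half_width
```

A typical use would be to call `randomized_runs` once for the baseline binary and once for the optimized one, then compare the two intervals from `summarize` rather than two single measurements. If the intervals overlap heavily, the claimed speedup may simply be an artifact of one particular setup.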

[1]: Producing wrong data without doing anything obviously wrong

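P.S. The title is borrowed from this article by mihir0 on Guokr:

"Significance, Not Scruples: How to Write a Psychology Paper with No Scruples Left" (要显著,不要节操——如何写一篇节操丧尽的心理学论文)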

Excerpt: A Paper Abstract Full of Humor

November 30th, 2012

See the original here.

Using code examples in professional software development is like teenage sex. Those who say they do it all the time are probably lying. Although it is natural, those who do it feel guilty. Finally, once they start doing it, they are often not too concerned with safety, they discover that it is going to take a while to get really good at it, and they realize they will have to come up with a bunch of new ways of doing it before they really figure it all out.

Possible Research Topics in IonMonkey

November 16th, 2012

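A week ago, a fourth-year undergraduate from Brazil sent an email [1] to the Mozilla JS-Internals mailing list. He said he would begin graduate study in computer science in the second half of the year and hoped to do some research on IonMonkey, but having only just started with IonMonkey he did not yet have a feel for it and hoped to get some pointers. Today David Anderson, who leads the Mozilla JS engine, replied and pointed out several research projects they are interested in [2]. Interested readers may want to take a look: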

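  1. Escape Analysis: No work has been done on this yet, so even without a complete implementation of the algorithm, just collecting some measurement data would be valuable. Escape analysis can help reduce redundant heap memory usage (a heap object cannot simply be discarded while it is unknown whether other threads still reference it).
  2. Better Alias Analysis: IonMonkey currently has an alias analysis (in js/src/ion/AliasAnalysis.{h,cpp}), but it is fairly coarse. For example, given an expression like "v.x + v.y + v.z", the current analysis treats both v.x and v.z as aliases of v.y, which hinders later optimizations.
  3. RA Improvements (improving the register allocation algorithm): Rewriting a register allocator from scratch is very hard and a huge amount of work. Making improvements on top of the existing RA implementation would also be very worthwhile.
  4. Control-flow Elimination (eliminating rarely used control flow): IonMonkey can currently eliminate individual instructions but cannot eliminate blocks in the CFG. Once this is implemented, we (the developers) could run further experiments with more aggressive optimizations that remove rarely taken branches, which might also improve the results of RA.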
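To illustrate why the coarse alias analysis in item 2 gets in the way, here is a toy sketch that compares a "everything aliases everything" rule with a rule that distinguishes property names. It is not IonMonkey's implementation. The `Access` type and the function names are hypothetical, and the field-based rule assumes plain data properties (no getters, setters, or proxies).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    """A property access such as v.x (hypothetical model, not IonMonkey's)."""
    base: str   # name of the object variable, e.g. "v"
    field: str  # property name, e.g. "x"


def may_alias_coarse(a, b):
    """Coarse rule: any property access may overlap with any other."""
    return True


def may_alias_by_field(a, b):
    """Finer rule: only accesses to the same property name may overlap.

    The base is deliberately ignored, so v.x and w.x are still treated as
    possible aliases, since v and w may refer to the same object.
    """
    return a.field == b.field


def loads_killed_by_store(loads, store, may_alias):
    """Return the previously loaded values that a store forces us to reload."""
    return [load for load in loads if may_alias(load, store)]


loads = [Access("v", "x"), Access("v", "y"), Access("v", "z")]
store = Access("v", "y")

# Under the coarse rule, a store to v.y invalidates all three loads.
killed_coarse = loads_killed_by_store(loads, store, may_alias_coarse)

# Under the field-based rule, only the load of v.y is invalidated.
killed_fine = loads_killed_by_store(loads, store, may_alias_by_field)
```

With the finer rule, the values of v.x and v.z can stay in registers across the store, which is the kind of follow-up optimization the coarse analysis prevents.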

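IonMonkey is still under development. It does not yet support many analyses and optimizations, and the existing implementations are fairly simple, so there should still be plenty of opportunities.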

References:

[1]: http://www.mail-archive.com/dev-tech-js-engine-internals@lists.mozilla.org/msg00120.html

[2]: http://www.mail-archive.com/dev-tech-js-engine-internals@lists.mozilla.org/msg00122.html