###

alias hbin me

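Lazy Evaluation in Ruby

Today I came across an article on how introducing lazy evaluation made Lo-Dash a hundred times faster: How to Speed Up Lo-Dash ×100? Introducing Lazy Evaluation

So I googled for the Ruby equivalent and found the Enumerator::Lazy class.

I wrote a benchmark: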

# lazy_benchmark.rb
require 'benchmark/ips'

ARRAY = 1.upto(1_000_000)

Benchmark.ips do |x|
  x.report 'normal' do
    ARRAY.map { |i| i * i }.take(10).to_a
  end

  x.report 'lazy' do
    ARRAY.lazy.map { |i| i * i }.take(10).to_a
  end

  x.compare!
end

The benchmark result shocked me!

Calculating -------------------------------------
              normal     1.000  i/100ms
                lazy     4.883k i/100ms
-------------------------------------------------
              normal      9.555  (± 0.0%) i/s -     48.000
                lazy     54.282k (± 3.5%) i/s -    273.448k

Comparison:
                lazy:    54282.1 i/s
              normal:        9.6 i/s - 5681.15x slower
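
A likely explanation for the gap, as a minimal sketch: the eager version squares every element before take(10) ever runs, while the lazy version only pulls as many elements as take asks for. The counters below (eager_calls, lazy_calls) are names I made up for illustration, and the range is shortened to 1,000 elements to keep it quick.

```ruby
# Count how often the map block runs in each style.
eager_calls = 0
lazy_calls = 0
source = 1..1_000

# Eager: map builds the full 1,000-element array, then take keeps 10 of them.
eager = source.map { |i| eager_calls += 1; i * i }.take(10)

# Lazy: nothing runs until to_a, and take stops the chain after 10 elements.
lazy = source.lazy.map { |i| lazy_calls += 1; i * i }.take(10).to_a

puts eager == lazy # true
puts eager_calls   # 1000
puts lazy_calls    # 10
```

The work saved grows with the size of the source, which fits the roughly 5,000x difference measured on a million elements.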

I'll leave this as a note to self for now and dig into lazy evaluation later.

Capistrano && dotenv

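Today I saw that Hooopo shared a great article about deploying with Heroku-style environment variables. If you are interested, have a look: http://www.jianshu.com/p/a80bdfdabce5

The book mentioned in the article, <The TWELVE-FACTOR>, is simply a model of software development best practice. It covers code management, configuration management, log management, continuous delivery, microservices, multi-threading/multi-processing, and more. I reopen it every once in a while and read it with reverence, and I get a lot out of it every time.

The last time I read it, chapter III. Config caught my interest, so I applied it in my own Sample App.

However, I ran into a problem: when deploying Sidekiq with capistrano-sidekiq, it reported that the environment variables could not be found, so Sidekiq failed to start.

After some investigation, it turned out to be an issue in capistrano-sidekiq. Capistrano maintains a CommandMap, for example: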

  {
    :sidekiq => ["rbenv exec", "bundle exec"],
    :sidekiqctl => ["rbenv exec", "bundle exec"]
  }

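I wrote the values as a Ruby Hash here; in reality this is a CommandHash.

When Capistrano needs to run one of these commands, it only has to call #to_s to get the correct command line.

Back to deploying with dotenv: the Rails application has no problem and loads the .env file fine, but sidekiq cannot load it. Fortunately, dotenv-rails offers another option, the dotenv command: dotenv sidekiq

But in capistrano-sidekiq v0.5.3, the command to execute was hard-coded.

The right approach is to add the sidekiq and sidekiqctl commands to bundle_bins, so I sent the author a PR. It has already been merged in the latest version.

Finally, we only need to add the dotenv command to the CommandMap above: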

  # https://github.com/hbin/sample_app/blob/master/config%2Fdeploy.rb#L28-L34
  namespace :bundler do
    task :map_bins do
      fetch(:bundle_bins).each do |command|
        SSHKit.config.command_map.prefix[command.to_sym].push('dotenv')
      end
    end
  end

Now when Capistrano starts Sidekiq, the generated command is:

~/.rbenv/bin/rbenv exec bundle exec dotenv sidekiq --index 0 --pidfile /var/www/sample_app_production/shared/tmp/pids/sidekiq-0.pid --environment production --logfile /var/www/sample_app_production/shared/log/sidekiq.log --daemon as deploy@sample.com
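
To see why pushing dotenv onto a prefix list changes the final command, here is a rough sketch of the idea. SketchCommandMap is a hypothetical stand-in I wrote for illustration, not SSHKit's real class; it only mimics the "list of prefixes per command" behaviour described above.

```ruby
# Hypothetical stand-in for a command map: each command has a list of
# prefixes that are prepended, in order, when the command line is built.
class SketchCommandMap
  attr_reader :prefix

  def initialize
    # Unknown commands start with an empty prefix list.
    @prefix = Hash.new { |hash, command| hash[command] = [] }
  end

  # Build the full command line for a command name.
  def [](command)
    (prefix[command.to_sym] + [command.to_s]).join(' ')
  end
end

map = SketchCommandMap.new
map.prefix[:sidekiq].push('rbenv exec', 'bundle exec')
before = map[:sidekiq]
puts before # rbenv exec bundle exec sidekiq

map.prefix[:sidekiq].push('dotenv')
after = map[:sidekiq]
puts after  # rbenv exec bundle exec dotenv sidekiq
```

Because dotenv is pushed last, it ends up directly in front of sidekiq, which matches the generated command above.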

With that, deployment with dotenv works end to end.

:tada: :tada:

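CORS (Cross-Origin Resource Sharing) Solutions

Recently, while writing a dashboard with a front-end framework, I ran into a cross-origin request problem. This is probably old news to many people, but I had never looked into it deeply, so I am taking this chance to summarize.

Why is there a cross-origin request problem?

A website may consist of web pages, an API, and all kinds of asset files (css, js, images, fonts), and these may live under different domains. Take *.example.com as an example: www serves the pages, api serves the API, and assets serves the asset files. When a page embeds web fonts or makes a cross-origin AJAX request, the same-origin policy causes the response to be blocked.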
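
As a small sketch of what "same origin" means in practice: two URLs share an origin only when scheme, host, and port all match. The helper name same_origin? is my own, and it relies on Ruby's URI filling in the default port for http and https.

```ruby
require 'uri'

# Hypothetical helper: compare the (scheme, host, port) triple of two URLs.
def same_origin?(url_a, url_b)
  a = URI.parse(url_a)
  b = URI.parse(url_b)
  [a.scheme, a.host, a.port] == [b.scheme, b.host, b.port]
end

# Same scheme, host and port: same origin, only the path differs.
puts same_origin?('http://www.example.com/index.html', 'http://www.example.com/about') # true
# Different subdomain: cross-origin.
puts same_origin?('http://www.example.com/', 'http://api.example.com/orders')          # false
# Different scheme: cross-origin.
puts same_origin?('http://www.example.com/', 'https://www.example.com/')               # false
```

So a page on www talking to api or assets is already cross-origin, even though all three sit under example.com.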

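How can cross-origin requests be made to work?

There are in fact five ways to relax the same-origin policy for cross-origin requests:
From wiki: https://www.wikiwand.com/en/Same-origin_policy#/Relaxing_the_same-origin_policy

  • document.domain
  • Cross-Origin Resource Sharing (CORS)
  • Cross-document messaging
  • JSONP
  • WebSockets

CORS defines a standard by which the browser and the server decide whether a cross-site request is allowed. Compared with the other approaches it is more flexible and simpler, and it is the approach recommended by the W3C.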

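How does CORS work?

The CORS standard defines a new set of HTTP headers that give the browser and the server a basis for deciding whether a cross-origin request is legitimate. To implement CORS, therefore, both the browser (client) and the server must follow this convention.

  • The browser adds an Origin HTTP header to the request, whose value is the origin (domain) of the current page. For example, when a page on http://www.foo.com requests a resource from http://www.bar.com, it must send the HTTP header Origin: http://www.foo.com
  • The server receives the request and must include an Access-Control-Allow-Origin header in the response to state which origin is allowed; to allow all of them, use *. In the example above, the response from http://www.bar.com needs to carry Access-Control-Allow-Origin: http://www.foo.com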
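
The exchange above can be sketched as a tiny Rack-style middleware. This is only an illustration under my own assumptions: SketchCors and its allow-list are hypothetical names, it handles simple requests only (no preflight), and it is not how rack-cors is implemented.

```ruby
# Hypothetical Rack-style middleware: echo the Origin back in
# Access-Control-Allow-Origin when it is on the allow-list.
class SketchCors
  def initialize(app, allowed_origins)
    @app = app
    @allowed_origins = allowed_origins
  end

  def call(env)
    status, headers, body = @app.call(env)
    origin = env['HTTP_ORIGIN'] # Rack exposes the Origin header under this key
    if origin && @allowed_origins.include?(origin)
      headers = headers.merge('Access-Control-Allow-Origin' => origin, 'Vary' => 'Origin')
    end
    [status, headers, body]
  end
end

# A trivial downstream app standing in for http://www.bar.com.
inner_app = ->(_env) { [200, { 'Content-Type' => 'text/plain' }, ['hello']] }
app = SketchCors.new(inner_app, ['http://www.foo.com'])

allowed = app.call({ 'HTTP_ORIGIN' => 'http://www.foo.com' })
denied = app.call({ 'HTTP_ORIGIN' => 'http://evil.example' })

puts allowed[1]['Access-Control-Allow-Origin']     # http://www.foo.com
puts denied[1].key?('Access-Control-Allow-Origin') # false
```

When the header is missing from the response, the browser refuses to hand the response to the page, which is the "blocked" behaviour described earlier.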

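Server-side solutions

Almost every framework has a ready-made library for this. I list only the three I have used:

1) Ruby Framework (Rails, Sinatra, etc)

Any Rack-based Ruby framework can use rack-cors. Its documentation is very detailed, so I won't repeat it here.

2) Flask

Flask also has an extension, flask-cors, but its documentation is poor, a common ailment in the Python community.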

# 1. Install
$ pip install -U flask-cors

# 2. app.py
from flask import Flask
from flask_cors import CORS  # older releases used: from flask.ext.cors import CORS

app = Flask(__name__)
CORS(app)

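CORS(app) accepts many options that the documentation did not list. Some of the more commonly used ones:

  • send_wildcard: if True, return *; if False, return the value of Origin
  • supports_credentials: whether cookies (credentials) are allowed; defaults to False
  • resources: the scope CORS applies to, r'/*' by default; to allow only URLs starting with /api/, use resources=r'/api/*'

3) Nginx

For Nginx, you only need to modify the configuration of the corresponding server block: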

server {
    set $cors "";

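    # Anchor the pattern and escape the dots so that only *.foo.com origins match.
    if ($http_origin ~* "^https?://[^/]+\.foo\.com(:[0-9]+)?$") {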
        set $cors "true";
    }

    location / {
        if ($cors = "true") {
            add_header 'Access-Control-Allow-Origin' "$http_origin";
            add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS, DELETE, PUT';
            add_header 'Access-Control-Allow-Credentials' 'true';
            add_header 'Access-Control-Allow-Headers' 'User-Agent,Keep-Alive,Content-Type';
        }

        if ($request_method = OPTIONS) {
            return 204;
        }
    }
}
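
The origin check in a config like this is just a case-insensitive regular expression match, so how strictly the pattern is written matters. The sketch below compares a loose pattern (unescaped dot, no anchors) with a stricter one. Both patterns and the sample origins are my own illustrations, written as Ruby regexes rather than Nginx ones.

```ruby
# Loose: the dot before "com" matches any character and nothing is anchored.
LOOSE_ORIGIN = /(.*\.foo.com)/i
# Stricter: anchored at both ends, dots escaped, at least one subdomain required.
STRICT_ORIGIN = %r{\Ahttps?://([a-z0-9-]+\.)+foo\.com(:\d+)?\z}i

origins = [
  'http://www.foo.com',           # intended to be allowed
  'http://a.foo.com.evil.example' # foo.com is only a prefix of another host
]

results = origins.map do |origin|
  [origin, LOOSE_ORIGIN.match?(origin), STRICT_ORIGIN.match?(origin)]
end

results.each do |origin, loose, strict|
  puts "#{origin} loose=#{loose} strict=#{strict}"
end
# http://www.foo.com loose=true strict=true
# http://a.foo.com.evil.example loose=true strict=false
```

Since the matched origin is echoed back together with Access-Control-Allow-Credentials, a pattern that matches too much effectively lets unrelated sites make credentialed requests.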

For a more complete configuration, see this Gist: https://gist.github.com/hbin/957e8a8df9eef8ecd36c

That's all.

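People Come and Go: Notes from Moonlighting as an Uber Driver

Uber has been in Beijing for almost a year, and I, always late to the party, have only now become interested in it. So, taking Xiao Ke (my car) along, I became a People's Uber driver.

I uploaded my documents on Wednesday and was approved on Friday. Uber is quite efficient, I must say.

Early Saturday morning, I fitted Xiao Ke with a new set of seat cushions (bought just for this). After lunch, with a bit of curiosity and a bit of excitement, I opened Uber Partner and hit the road. I got my first order before I had even left the neighborhood! The passenger was at Xizhaosi (夕照寺). When I got there, a young man not far away waved at me. He was a big guy and squeezed into the front seat.

"Hello." "Hello, Kaiyang Bridge please." "Sure, please fasten your seat belt while I set up the navigation." "No need for navigation, I know the way. Just drive straight ahead."

After just those few lines I was already a bit nervous, first because I didn't know the way and second because I was afraid of messing up my very first order. Only after driving two blocks did I remember that I had not swiped to start the trip -_-!. There was no traffic; after a bit more than 10 minutes we arrived, the young man thanked me and got out, and after I swiped to end the trip it showed a fare of 15 yuan. My nerves finally settled.

It seems becoming an Uber driver is quite easy. But I didn't know this area at all. What if I got an order and couldn't find the place? So I drove back with an empty car -_-!

The road home was a bit congested, and the round trip had already taken more than 40 minutes. Near my neighborhood I picked up another order, a very short one that paid only the minimum fare. The afternoon was both congested and hot, and by a rough count the 20 yuan I had earned (Uber takes a 20% cut) wouldn't even cover the fuel. So I drove home, planning to go out again in the evening.

I slept until after four in the afternoon, tidied up, and went out again. With the midday experience behind me I no longer felt like a novice, and I took 8 orders over the evening! The people I met were all quite interesting: a procrastinating girl rushing to catch a movie; a group of four who had just finished their master's degrees (asking all the way where to find good kebabs); a Beijing guy who reeked of smoke; a talkative guy in the clothing business (who nevertheless spent the whole ride telling me how much better Uber is than Didi, Kuaidi, Yidao Yongche and so on); and a doctor who had just gotten off work at the hospital. My last passenger was a very good-looking young woman, soft-spoken and seemingly well brought up. She seemed to have just returned from abroad: when I called her number, China Mobile told me I had not enabled international calling.

It was nearly midnight when I got home, but I was quite satisfied with the first day. The money isn't much, but meeting so many people, watching them come and go, getting them safely to their destinations, and receiving a "thank you" in return felt rewarding.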

Getting to Know Circus (1)

Circus is a Python application for monitoring and managing processes and sockets.

Prerequisites

  • Ubuntu 12.04 LTS or later
  • Python 2.7 with pip installed

Installation

  1. First, install from the PPA:

    $ sudo add-apt-repository ppa:roman-imankulov/circus
    $ sudo apt-get update
    $ sudo apt-get install circus
    

  2. The version in the PPA is rather old (0.7.0), so upgrade it with pip:

    $ sudo pip install -U circus
    

  3. Change /etc/init/circus.conf to:

    description "circusd"
    
    start on (net-device-up
             and local-filesystems
             and runlevel [2345])
    stop on runlevel [!2345]
    
    respawn
    
    exec /usr/local/bin/circusd \
           --log-level debug \
           --log-output /var/log/circus.log \
           --pidfile /var/run/circusd.pid \
           /etc/circus/circusd.ini
    

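    Restart the service: sudo service circus restart

    Note that the original /usr/bin/circusd has been changed to /usr/local/bin/circusd, because the path changes after upgrading with pip.

    The reason for installing from the PPA first and then upgrading with pip is that the PPA package sets up Upstart/Services for us, including the following three files: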

    • /etc/circus/circusd.ini
    • /etc/init.d/circus
    • /etc/init/circus.conf

An introduction to Upstart files: https://www.digitalocean.com/community/tutorials/the-upstart-event-system-what-it-is-and-how-to-use-it

A short Python application example

  1. Create a Python program: /path/to/the/myprogram.py

    import os
    from datetime import datetime
    
    # Append to a log file that sits next to this script.
    with open(os.path.join(os.path.dirname(__file__), 'myprogram.log'), 'a') as f:
        f.write("{}: myprogram is running!\n".format(datetime.now()))
    

  2. Create a configuration file: /etc/circus/conf.d/myprogram.ini

    [watcher:myprogram]
    cmd = python /path/to/the/myprogram.py
    

  3. Restart the Circus service

    $ sudo service circus restart
    

  4. Watch it run with tail -f /path/to/the/myprogram.log.

A short Flask application example

  1. Create a Flask application, webapp.py, under /path/to/myapps/

    from flask import Flask
    
    app = Flask(__name__)
    
    @app.route('/')
    def index():
        return 'Circus Awesome!'
    

  2. Set up a virtualenv and install Flask

    $ cd /path/to/myapps
    $ virtualenv venv
    $ source venv/bin/activate
    $ pip install flask chaussette
    

    Chaussette is a WSGI server.

  3. Create a configuration file: /etc/circus/conf.d/webapp.ini

    [watcher:webapp]
    copy_env = True
    virtualenv = /path/to/myapps/venv
    working_dir = /path/to/myapps
    
    use_sockets = True
    cmd = chaussette webapp.app
    args = --fd $(circus.sockets.webapp)
    numprocesses = 5
    
    [socket:webapp]
    host = 0.0.0.0
    port = 5000
    

  4. Restart the Circus service

    $ sudo service circus restart
    

  5. Visit http://localhost:5000/

Rails raw, html_safe vs html_escape(h) and benchmark

  • raw is a wrapper around String#html_safe.
  • String#html_safe just returns an instance of ActiveSupport::SafeBuffer.

@Daniel wrote a post about when to use raw() and when to use .html_safe

  • html_escape is originally defined as ERB::Util.html_escape and is also aliased as h.

There are several HTML escaping methods; here is a benchmark:

require 'benchmark/ips'
require 'open-uri'

require 'cgi'
require 'erb'
require 'rack'

puts "===== Short String =====\n\n"

Benchmark.ips do |x|
  SHORT_STR = %(<html><head></head><body></body></html>)

  x.report 'CGI::escapeHTML' do
    CGI::escapeHTML SHORT_STR
  end

  x.report 'ERB::Util.html_escape' do
    ERB::Util.html_escape SHORT_STR
  end

  x.report 'Rack::Utils.escape_html' do
    Rack::Utils.escape_html SHORT_STR
  end

  x.compare!
end

puts "===== Long String =====\n\n"

Benchmark.ips do |x|
  # URI.open comes from open-uri; Kernel#open no longer accepts URLs on Ruby 3.0+.
  LONG_STR  = URI.open('http://example.com/').read

  x.report 'CGI::escapeHTML' do
    CGI::escapeHTML LONG_STR
  end

  x.report 'ERB::Util.html_escape' do
    ERB::Util.html_escape LONG_STR
  end

  x.report 'Rack::Utils.escape_html' do
    Rack::Utils.escape_html LONG_STR
  end

  x.compare!
end

require 'active_support/core_ext/string'

puts "===== Short html safe string =====\n\n"

Benchmark.ips do |x|
  SHORT_HTML_SAFE_STR = %(<html><head></head><body></body></html>).html_safe

  x.report 'CGI::escapeHTML' do
    CGI::escapeHTML SHORT_HTML_SAFE_STR
  end

  x.report 'ERB::Util.html_escape' do
    ERB::Util.html_escape SHORT_HTML_SAFE_STR
  end

  x.report 'Rack::Utils.escape_html' do
    Rack::Utils.escape_html SHORT_HTML_SAFE_STR
  end

  x.compare!
end

puts "===== Long html_safe String =====\n\n"

Benchmark.ips do |x|
  # URI.open comes from open-uri; Kernel#open no longer accepts URLs on Ruby 3.0+.
  LONG_HTML_SAFE_STR  = URI.open('http://example.com/').read.html_safe

  x.report 'CGI::escapeHTML' do
    CGI::escapeHTML LONG_HTML_SAFE_STR
  end

  x.report 'ERB::Util.html_escape' do
    ERB::Util.html_escape LONG_HTML_SAFE_STR
  end

  x.report 'Rack::Utils.escape_html' do
    Rack::Utils.escape_html LONG_HTML_SAFE_STR
  end

  x.compare!
end

__END__


===== Short String =====
Comparison:
ERB::Util.html_escape: 113217.7 i/s
CGI::escapeHTML: 110218.2 i/s - 1.03x slower
Rack::Utils.escape_html: 81503.8 i/s - 1.39x slower

===== Long String =====
Comparison:
ERB::Util.html_escape: 25110.7 i/s
CGI::escapeHTML: 24430.1 i/s - 1.03x slower
Rack::Utils.escape_html: 16207.2 i/s - 1.55x slower

===== Short HTML Safe String =====
Comparison:
ERB::Util.html_escape: 2772776.1 i/s
CGI::escapeHTML: 106256.2 i/s - 26.10x slower
Rack::Utils.escape_html: 72086.8 i/s - 38.46x slower

===== Long HTML Safe String =====
Comparison:
ERB::Util.html_escape: 2749941.1 i/s
CGI::escapeHTML: 24777.1 i/s - 110.99x slower
Rack::Utils.escape_html: 16229.5 i/s - 169.44x slower
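
My reading of the large gap on html_safe strings is that, once ActiveSupport is loaded, ERB::Util.html_escape returns a string that is already marked safe without scanning it, while the other two methods always escape. Here is a minimal sketch of that short-circuit. SketchSafeString and sketch_html_escape are hypothetical stand-ins I wrote for illustration, not ActiveSupport's actual code.

```ruby
require 'erb'

# Hypothetical stand-in for ActiveSupport::SafeBuffer: a String that
# reports itself as already escaped.
class SketchSafeString < String
  def html_safe?
    true
  end
end

# Escape a value unless it is already marked as safe.
def sketch_html_escape(value)
  # Fast path: a safe string is returned as-is, with no scan and no allocation.
  return value if value.respond_to?(:html_safe?) && value.html_safe?

  SketchSafeString.new(ERB::Util.html_escape(value.to_s))
end

escaped = sketch_html_escape('<b>bold</b>')
puts escaped # &lt;b&gt;bold&lt;/b&gt;

# Escaping the result again hits the fast path and returns the same object.
again = sketch_html_escape(escaped)
puts again.equal?(escaped) # true
```

This would also explain why the html_safe numbers barely change between the short and the long string: the fast path does not depend on string length.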

Rails HTTP Request IDs

HTTP Request IDs make it easy to trace requests end-to-end through the stack and to identify individual requests in mixed logs such as Syslog.

Rails 3.2 introduced the ActionDispatch::RequestId middleware, which makes a unique X-Request-Id header available on the response.

For example: curl -I http://sample.dev/

HTTP/1.1 200 OK
...
X-Request-Id: ddaa28e2-3395-4e66-9ca7-48480882a1df
X-Runtime: 0.045724
Date: Fri, 04 Jul 2014 03:23:43 GMT
Connection: close
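
Conceptually, a middleware of this kind only needs to pick an ID per request and copy it onto the response. The sketch below is my own simplified version for illustration: SketchRequestId and the env key sketch.request_id are hypothetical names, and unlike a production middleware it does not sanitize an incoming X-Request-Id.

```ruby
require 'securerandom'

# Hypothetical Rack-style middleware: reuse an incoming X-Request-Id if the
# client sent one, otherwise generate a UUID, and expose it on the response.
class SketchRequestId
  def initialize(app)
    @app = app
  end

  def call(env)
    request_id = env['HTTP_X_REQUEST_ID'] || SecureRandom.uuid
    env['sketch.request_id'] = request_id # downstream code can read it here
    status, headers, body = @app.call(env)
    [status, headers.merge('X-Request-Id' => request_id), body]
  end
end

inner_app = ->(_env) { [200, { 'Content-Type' => 'text/plain' }, ['ok']] }
app = SketchRequestId.new(inner_app)

generated = app.call({})
reused = app.call({ 'HTTP_X_REQUEST_ID' => 'abc-123' })

puts generated[1]['X-Request-Id'] # a fresh UUID on every call
puts reused[1]['X-Request-Id']    # abc-123
```

Reusing an ID that arrives from upstream is what makes end-to-end tracing possible when a proxy or router has already tagged the request.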

To show this request ID in your application logs, add this line to your config/environments/production.rb:

config.log_tags = [:uuid]

The logs will then be tagged with the Request ID:

I, [2014-07-04T11:23:43.752028 #33305]  INFO -- : [ddaa28e2-3395-4e66-9ca7-48480882a1df] Started HEAD "/" for 127.0.0.1 at 2014-07-04 11:23:43 +0800
I, [2014-07-04T11:23:43.755366 #33305]  INFO -- : [ddaa28e2-3395-4e66-9ca7-48480882a1df] Processing by StaticPagesController#home as */*
I, [2014-07-04T11:23:43.759832 #33305]  INFO -- : [ddaa28e2-3395-4e66-9ca7-48480882a1df]   Rendered static_pages/home.html.erb within layouts/application (0.2ms)
I, [2014-07-04T11:23:43.791258 #33305]  INFO -- : [ddaa28e2-3395-4e66-9ca7-48480882a1df]   Rendered layouts/_shim.html.erb (0.0ms)
I, [2014-07-04T11:23:43.793947 #33305]  INFO -- : [ddaa28e2-3395-4e66-9ca7-48480882a1df]   Rendered layouts/_header.html.erb (0.3ms)
I, [2014-07-04T11:23:43.796489 #33305]  INFO -- : [ddaa28e2-3395-4e66-9ca7-48480882a1df]   Rendered layouts/_footer.html.erb (0.2ms)
I, [2014-07-04T11:23:43.797112 #33305]  INFO -- : [ddaa28e2-3395-4e66-9ca7-48480882a1df] Completed 200 OK in 42ms (Views: 41.2ms | ActiveRecord: 0.0ms)
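
Outside Rails, the same tagging effect can be approximated with the standard library Logger by wrapping its formatter. This is only a sketch of the idea, not how Rails implements log_tags; the request ID is hard-coded from the example above and the output goes to a StringIO so it can be inspected.

```ruby
require 'logger'
require 'stringio'

output = StringIO.new
logger = Logger.new(output)

# In a real app this would come from the current request.
request_id = 'ddaa28e2-3395-4e66-9ca7-48480882a1df'

# Wrap the default formatter so every message is prefixed with the tag.
default_formatter = Logger::Formatter.new
logger.formatter = proc do |severity, time, progname, message|
  default_formatter.call(severity, time, progname, "[#{request_id}] #{message}")
end

logger.info('Started HEAD "/" for 127.0.0.1')
puts output.string
```

With the ID on every line, grepping a mixed log for one UUID yields the full story of a single request.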

References:

  1. http://guides.rubyonrails.org/3_2_release_notes.html#action-dispatch
  2. https://devcenter.heroku.com/articles/http-request-id

Microsoft Excel Issues "File error: data may have been lost"

Last week, we added a feature that lets users export their orders to a spreadsheet document.

We use the spreadsheet gem to modify an existing template spreadsheet. But when the modified spreadsheet is opened in Microsoft Excel, it reports an error:

File error: data may have been lost

After some googling, the suggested fix was to set the encoding explicitly:

Spreadsheet.client_encoding = 'UTF-16LE'

But it doesn't work, WTF!

Finally, I decided to create a new spreadsheet and set the cell formats by hand instead of modifying the existing template. It was a painful task, but the spreadsheet now opens just fine without the error.

Even though it isn't elegant, we can get rid of that annoying error.

Resolving Vim Key Mapping Conflict

Ack.vim is a plugin for ack, the Perl CLI script that serves as a replacement for grep. It provides a front end for running ack from Vim.

By default, it searches recursively under the current directory, which is inconvenient when you want to search a whole project.

Fortunately, there is a plugin, vim-rooter, which changes the working directory to the project root automatically.

I use Janus for my Vim, so I git cloned vim-rooter into the ~/.janus directory. It works great, but when I open Vim from the terminal, I get a key mapping conflict error message:

Error detected while processing /Users/hbin/.janus/vim-rooter/plugin/rooter.vim:
line  159:
E227: mapping already exists for ,cd

That's because Janus maps <leader>cd to change the path to the active buffer's file, and vim-rooter also tries to map <leader>cd to <Plug>RooterChangeToRootDirectory.

Here is the source:

if !hasmapto("<Plug>RooterChangeToRootDirectory")
  map <silent> <unique> <Leader>cd <Plug>RooterChangeToRootDirectory
endif

The solution is simple: create your own mapping to <Plug>RooterChangeToRootDirectory, so that the hasmapto() check above is true and vim-rooter skips its default mapping:

nmap <leader>cr <Plug>RooterChangeToRootDirectory

References:

  1. http://vim.wikia.com/wiki/Mapping_keys_in_Vim_-_Tutorial_(Part_1)
  2. http://stackoverflow.com/questions/3776117/what-is-the-difference-between-the-remap-noremap-nnoremap-and-vnoremap-mapping

Enjoy!

The smart-shift package released!

smart-shift is a minor mode for conveniently shifting the line/region to the left/right by the current major mode's indentation width.

Installation

Melpa

Once you have set up Melpa, you can use the package-install command to install it. The package name is smart-shift.

Manual

(add-to-list 'load-path "/path/to/smart-shift")
(require 'smart-shift)
(global-smart-shift-mode 1)

Customizing

smart-shift infers the indentation level of the current major mode; if none of the major modes listed below matches, it uses tab-width as the default.

It can also be set to a number explicitly.

(setq smart-shift-indentation-level 2)

Or, for a major mode we don't support yet, add the following snippet to your config file. Test it and send a PR.

(eval-after-load 'your-major-mode
  '(progn
     (add-to-list 'smart-shift-mode-alist
                  '(major-mode-or-derived-mode . customize-base-offset))))

Supported major modes

  • lisp-mode
  • emacs-lisp-mode
  • c-mode
  • c++-mode
  • objc-mode
  • java-mode
  • idl-mode
  • pike-mode
  • awk-mode
  • ruby-mode
  • python-mode
  • swift-mode
  • js-mode
  • js2-mode
  • coffee-mode
  • css-mode
  • scss-mode
  • slim-mode
  • html-mode
  • web-mode
  • sh-mode
  • yaml-mode
  • text-mode
  • markdown-mode
  • fundamental-mode

Interactive commands

Command            Keybinding  Description
smart-shift-left   C-c [       Shift the line or region ARG times to the left.
smart-shift-right  C-c ]       Shift the line or region ARG times to the right.

After invoking smart-shift-left or smart-shift-right the first time, you can simply hit [ or ] to continuously shift to left or right, respectively.

If you use key-chord like me, I strongly recommend adding the following snippet:

(key-chord-define-global "<<" 'smart-shift-left)
(key-chord-define-global ">>" 'smart-shift-right)

Contribute

Repo is here, forks and pull requests are welcome!