Python + urllib2 + regular expressions + bs4
or
Node.js + co, together with any DOM framework or HTML parser, plus Request and regular expressions, is also convenient.
For me the two options above are roughly equivalent, but since I'm more familiar with JS, these days I tend to pick the Node platform.
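As a rough illustration of the first option, here is a minimal Python 2 sketch; the URL, the tags being pulled, and the date regex are placeholders for illustration, not from any real project:

```python
# -*- coding: utf-8 -*-
# Minimal Python 2 sketch of the urllib2 + bs4 + regex approach.
# The URL, tags, and regex below are only placeholders.
import re
import urllib2

from bs4 import BeautifulSoup

url = 'http://example.com/articles'          # hypothetical listing page
html = urllib2.urlopen(url, timeout=10).read()

# Structured parsing with bs4: grab every link and its text.
soup = BeautifulSoup(html, 'html.parser')
for link in soup.find_all('a', href=True):
    print link['href'], link.get_text(strip=True)

# Regular expressions for the bits that are easier to grab from raw text,
# e.g. dates like 2015-06-01 scattered through the page.
dates = re.findall(r'\d{4}-\d{2}-\d{2}', html)
print dates
```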
For crawling at the scale of an entire site:
Python + Scrapy
If the DIY spiders in the two approaches above are "millet plus rifles" (i.e., bare-bones kit), then Scrapy is heavy artillery. It is extremely handy: customizable crawl rules, HTTP error handling, XPath, RPC, the pipeline mechanism, and so on. And because Scrapy is built on Twisted, it is also very efficient. Relatively speaking, its only drawback is that installation is more troublesome and the dependencies are heavier; I'm on a fairly new version of OS X and couldn't install Scrapy directly with pip.
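To give a feel for those pieces, here is a minimal Scrapy sketch with an XPath-based spider and a tiny item pipeline; the spider name, start URL, selectors, and the pipeline rule are all assumptions for illustration, and the pipeline would still need to be registered in ITEM_PIPELINES:

```python
# Minimal Scrapy sketch: one spider with XPath extraction and pagination,
# plus a small item pipeline. All names and paths here are placeholders.
import scrapy
from scrapy.exceptions import DropItem


class BlogSpider(scrapy.Spider):
    name = 'blog'
    start_urls = ['http://posts.example.com/']   # hypothetical start page

    def parse(self, response):
        # Extract each post's title and link with XPath.
        for post in response.xpath('//div[@class="post"]'):
            yield {
                'title': post.xpath('.//h2/a/text()').extract_first(),
                'url': post.xpath('.//h2/a/@href').extract_first(),
            }
        # Follow the "next page" link, if any.
        next_page = response.xpath('//a[@rel="next"]/@href').extract_first()
        if next_page:
            yield scrapy.Request(response.urljoin(next_page), callback=self.parse)


class DropShortTitles(object):
    """Example pipeline: discard items whose title is missing or too short."""

    def process_item(self, item, spider):
        if not item.get('title') or len(item['title']) < 3:
            raise DropItem('bad title')
        return item
```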
In addition, if you write your spider with XPath and install an XPath plugin in Chrome, the parsing paths are clear at a glance and development efficiency is very high.
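For example, an XPath copied from a Chrome plugin can be tried against a Selector (or in the Scrapy shell) before it goes into the spider; the HTML fragment and both paths below are made up:

```python
# Sketch: testing a browser-copied XPath before committing it to the spider.
from scrapy.selector import Selector

body = '<div id="content"><h2 class="title"><a href="/p/1">Hello</a></h2></div>'
sel = Selector(text=body)

# What the browser typically hands you (absolute, brittle):
print(sel.xpath('/html/body/div/h2/a/text()').extract_first())

# What you usually keep in the spider (relative, resilient to layout shifts):
print(sel.xpath('//div[@id="content"]//h2[@class="title"]/a/text()').extract_first())
```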