Python Selenium

  1. Install: selenium, browser driver
  2. browser
  3. Finding elements: find_elements/find_elements_by_xxx, find_element/find_element_by_xxx (the *_by_xxx helpers were removed in Selenium 4.3; use find_element(By.XXX, ...) there)
  4. Interaction: actions (e.g. click, key_down, ...), action chains (ActionChains, drag_and_drop)
  5. Executing JavaScript: execute_script(...)
  6. Switching: switch_to.xxx, back()/forward()
  7. Exception handling: selenium.common.exceptions (e.g. TimeoutException, NoSuchElementException)
  8. Cookies: add/get/delete_cookie(...), get_cookies(), delete_all_cookies()
  9. Waiting for elements: forced wait time.sleep(seconds), implicit wait browser.implicitly_wait(seconds), explicit wait WebDriverWait + expected_conditions
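
The difference between a forced wait and an explicit wait is easiest to see in code. What follows is a minimal pure-Python sketch of the polling idea only, not Selenium's implementation: `wait_until`, `WaitTimeout`, and the fake `page` dict are hypothetical names made up for illustration. In Selenium itself the corresponding call is `WebDriverWait(driver, timeout).until(condition)`, which raises `TimeoutException` when the condition never holds.

```python
import time


class WaitTimeout(Exception):
    """Hypothetical stand-in for selenium's TimeoutException."""


def wait_until(condition, timeout=2.0, poll_frequency=0.05):
    """Poll `condition` until it returns a truthy value or `timeout` elapses.

    A forced wait (time.sleep) always pays the full delay; an explicit wait
    returns as soon as the condition holds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout}s")
        time.sleep(poll_frequency)


# Fake "page" (assumption for the demo): the element appears on the third poll.
page = {"polls": 0}


def element_present():
    page["polls"] += 1
    return "<button id='submit'>" if page["polls"] >= 3 else None


found = wait_until(element_present)
```

Here the wait returns after three polls (roughly 0.1 s) instead of sleeping for the full two-second budget, which is the practical argument for preferring explicit waits over time.sleep.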


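Python Crawler Framework: Scrapy

  • Scrapy architecture, common commands, document parsing
  • Spider: Spider, CrawlSpider, XMLFeedSpider, CSVFeedSpider, SitemapSpider
  • Item, ItemLoader
  • Middleware: ItemPipeline, Spider/DownloaderMiddleware, Item Exporters
  • Examples: Excel-driven crawling, login, common problems
  • Scrapy-Redis: distributed architecture (shared crawl queue)
  • Scrapyd (distributed deployment)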
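
As a small illustration of the ItemPipeline entry above: an item pipeline is a plain class exposing `process_item(self, item, spider)`, so it can be sketched and exercised without running a crawl. The class name `PriceCleanupPipeline` and the `title`/`price` fields are assumptions invented for this example.

```python
class PriceCleanupPipeline:
    """Hypothetical pipeline: normalizes the 'title' and 'price' fields."""

    def process_item(self, item, spider):
        # Scrapy calls this once per scraped item; the returned item is
        # handed to the next pipeline in the chain.
        item["title"] = item["title"].strip()
        item["price"] = float(str(item["price"]).replace("$", "").replace(",", ""))
        return item


# Exercising the pipeline by hand with a plain dict; no Scrapy process needed.
raw = {"title": "  Learning Scrapy \n", "price": "$1,299.50"}
cleaned = PriceCleanupPipeline().process_item(raw, spider=None)
```

In a real project the class would be enabled through the `ITEM_PIPELINES` setting, for example `{"myproject.pipelines.PriceCleanupPipeline": 300}`, where the module path is again hypothetical.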


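Python Basics

  1. Env & Tools: python (+pip), ipython, Anaconda (+conda), PyCharm
  2. Basics: keywords, comments, input, print, operators, if-else, for, while, try...except, sys.argv, unittest
  3. Basic data types: immutable objects (int, float, str, tuple) & mutable objects (list, set, dict), garbage collection, type conversion, bitwise operations
  4. Functions: local/global variables, default/variable-length arguments, closures, list comprehensions, generators, iterators, decorators, anonymous functions (lambda), common built-in functions
  5. Classes: class attributes/methods/static methods, instance attributes/methods, inheritance and polymorphism, metaclasses, enum classes, singleton pattern, built-in class attributes, customizing classes
  6. Libraries: modules and packages, imports, custom libraries, publishing/installing, common standard libraries, common third-party libraries
  7. File I/O: reading and writing files, working with files/directories
  8. Multitasking: processes, threads, coroutines
  9. Database access: SQLite, MySQL, Redis, MongoDB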
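
Several of the function topics in item 4 (closures, default arguments, decorators, generators, list comprehensions) fit into one short sketch. All names here (`make_counter`, `log_calls`, `add`, `fibonacci`) are illustrative choices, not taken from the original post.

```python
import functools


def make_counter():
    """Closure: `count` lives on in the enclosing scope between calls."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


def log_calls(func):
    """Decorator: records each call's arguments on the wrapper's `calls` list."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)

    wrapper.calls = []
    return wrapper


@log_calls
def add(a, b=10):  # `b` has a default value
    return a + b


def fibonacci():
    """Generator: yields Fibonacci numbers lazily, one per next() call."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


counter = make_counter()
counter()
second = counter()  # the closure remembers the first call

total = add(1)  # uses the default b=10

fib = fibonacci()
first_five = [next(fib) for _ in range(5)]  # list comprehension over an iterator
```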


Message Middleware: RabbitMQ

  1. AMQP: Advanced Message Queuing Protocol
  2. RabbitMQ: Producer -> VirtualHost (Exchange -> Binding: routingKey, headers, all, ... -> Queue) -> Consumer
  3. Installing RabbitMQ with Docker
  4. Demo: Direct/Topic/Headers/Fanout Exchange
  5. Demo: Exchange object
  6. Demo: Reliable sending
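
How the exchange types in item 4 decide which queues receive a message can be sketched without a broker. The following is a pure-Python simulation of the routing rules only, not RabbitMQ client code: `topic_matches`, `route`, and the queue names are hypothetical, and the headers exchange is left out for brevity.

```python
def topic_matches(pattern, routing_key):
    """Return True if an AMQP topic `pattern` matches `routing_key`.

    Words are separated by dots; '*' matches exactly one word and
    '#' matches zero or more words.
    """

    def match(p, k):
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb anywhere from 0 to len(k) words.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (head == "*" or head == k[0]) and match(rest, k[1:])

    return match(pattern.split("."), routing_key.split("."))


def route(exchange_type, bindings, routing_key):
    """Return the sorted queue names that would receive the message."""
    if exchange_type == "fanout":
        # Fanout ignores the routing key entirely.
        return sorted({queue for queue, _ in bindings})
    if exchange_type == "direct":
        return sorted({queue for queue, key in bindings if key == routing_key})
    if exchange_type == "topic":
        return sorted(
            {queue for queue, key in bindings if topic_matches(key, routing_key)}
        )
    raise ValueError(f"unsupported exchange type: {exchange_type}")


# Hypothetical bindings: (queue name, binding key).
bindings = [("errors", "log.error"), ("all_logs", "log.#"), ("one_level", "log.*")]

direct_hits = route("direct", bindings, "log.error")
topic_hits = route("topic", bindings, "log.error.db")
fanout_hits = route("fanout", bindings, "ignored")
```

With these bindings, a direct exchange delivers `log.error` only to `errors`, a topic exchange delivers `log.error.db` only to `all_logs` (because `log.*` covers exactly one extra word), and a fanout exchange delivers to every bound queue.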


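RPC: Dubbo

  1. RPC: Remote Procedure Call
    • Four core components (Client, Server, Client Stub, Server Stub)
    • What to weigh when choosing a framework (I/O, threading, serialization, multi-language support, service governance)
    • Popular frameworks (Dubbo, Motan, Thrift, gRPC)
  2. Dubbo: Provider, Consumer, Registry, Monitor, Container
  3. HelloWorld
  4. Demo (with Spring Boot)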
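
The four core components from item 1 can be shown in a toy, in-process form. This is a language-neutral sketch of the stub idea written in Python, not Dubbo code: the class names, the JSON wire format, and the direct function call standing in for the network are all assumptions made for illustration.

```python
import json


class Server:
    """Hosts the real procedures."""

    def add(self, a, b):
        return a + b


class ServerStub:
    """Decodes a request, invokes the server procedure, and encodes the reply."""

    def __init__(self, server):
        self._server = server

    def handle(self, payload: bytes) -> bytes:
        request = json.loads(payload.decode("utf-8"))
        method = getattr(self._server, request["method"])
        result = method(*request["args"])
        return json.dumps({"result": result}).encode("utf-8")


class ClientStub:
    """Makes a remote call look local: encodes the call and decodes the reply."""

    def __init__(self, transport):
        self._transport = transport  # callable: bytes -> bytes

    def __getattr__(self, name):
        # Only reached for attributes not found normally, i.e. remote methods.
        def remote_call(*args):
            payload = json.dumps({"method": name, "args": args}).encode("utf-8")
            reply = self._transport(payload)
            return json.loads(reply.decode("utf-8"))["result"]

        return remote_call


# In-process "network": the transport is just a function call here.
client = ClientStub(transport=ServerStub(Server()).handle)
answer = client.add(2, 3)
```

The client code calls `client.add(2, 3)` as if it were local; serialization and dispatch are hidden in the two stubs, which is where a real framework's choices about I/O model and serialization format come into play.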


Consistency Service: ZooKeeper (CuratorFramework)

  1. dependency: curator-recipes
  2. Start/Close: CuratorFrameworkFactory.builder()...build(), start(), close()
  3. CRUD: create(), getData(), setData(), delete(), checkExists(), getChildren()
  4. Watch: usingWatcher(watcher), NodeCache, PathChildrenCache, TreeCache
  5. ACL: scheme:id:permission, new ACL(perms,id), .aclProvider(aclProvider), .authorization(authInfoList), .withACL(aclList), get/setACL()
  6. An example application: the client watches for file additions and deletions to keep files synchronized from the server to the client
