Ever since Ricky joined our team, we have fallen in love with the programming language Python. After Josh joined, we fell even more in love. For anyone who takes an interest in crawling the web, analyzing data, or building applications, Python brings you some wicked powerful tools that are simple to use.
Josh found out that Big Apple Py was hosting an event here in New York City, and we couldn’t resist spending a beautiful August weekend (in a windowless midtown conference center) with the local Python enthusiasts. And we were right. We heard some insightful talks from some pretty bright engineers over at companies like Spotify, Venmo, and Bitly. The event was sponsored by some of the most admirable companies, including GitHub, BuzzFeed, DigitalOcean, Rackspace, Twilio, AppNexus, and Chartbeat.
Our journey began with an awesome debugging workshop taught by @amygdalama called Python Wats: Uncovering Odd Behavior. Amy is a Hacker School alum and engineer at Venmo. The session was filled with 90 slides of Python trivia, tips, tricks, and fun facts. Check it out on SlideShare.
@ericschles, Adjunct Professor at NYU, presented How I use Python to Fight Human Trafficking. Schles is an economist and computer scientist who has spent his career studying the worldwide epidemic of human trafficking. He is working on creating a global missing persons database. The internet has fueled the growth of the sex trade to an all-time high, an estimated $100b per year (that’s about twice Google’s annual revenue). Schles is on a mission to hack the internet to fight back, and teaches us some of the techniques he uses in the field, such as web scraping and image recognition. He crawls sites like Craigslist and Backpage, and compares images against those he extracts from the National Center for Missing & Exploited Children’s website. Schles uses requests, lxml, xpath, selenium, nltk, cv2, and ipython to hunt down these predators. His work has led to the capture and arrest of three offenders so far, and he’s just beginning. You can check out his presentation and contribute to his project at https://github.com/EricSchles/.
Some other interesting sessions we attended include:
- Architectural evolution in startups by @martinmelin. Martin is an engineer at ecommerce platform Tictail, and goes over some of the lessons he learned scaling out the software infrastructure as the company expanded after their latest round of funding (total $10.6m).
- Python Begets Python: BattleSchool Provisioning, via Ansible, can Self-Document and Configure a Mac to get Productive faster and thus Produce Working Software faster by Anne Moroney. Anne teaches us how BattleSchool uses Ansible to provision and configure a Mac, so that a fresh machine is ready for development sooner.
- Asynchronous Web Scraping with Asyncio by @burgraa. Bugra gets us excited about the new asyncio module available in the standard library of Python 3.4. We can now process and scrape data asynchronously without having to use any third-party imports. Bugra walks us through the new module and answers questions about it.
- Macro Scaling via Microservices presented by Peter Herndon, Sr. Engineer at bitly. Peter gives us an overview of the service-oriented architecture used at bitly. He explains how tornado and nsq are configured to handle billions of messages per day. A tremendous amount of web traffic is routed through bitly, which maintains reliability and scale through a distributed architecture.
- Enough Machine Learning to Make Hacker News Readable Again by @nedjl. Ned is a principal engineer over at Spotify. He uses machine learning to help filter the ‘dreck’ from Hacker News, and to figure out what content is most likely to perform well. scikit-learn and nltk do a lot of the heavy lifting to abstract away complex grammar and mathematics. The content is separated into two buckets – ‘dreck’ (Yiddish for crap) and ‘non-dreck’ (his ‘dreck’ bucket contains about 18% of all HN content). Factors such as the TLD of the article are considered, as well as other attributes, using string matching. A lot of time is saved by not having to read the ‘dreck’, and his algorithms can also help predict what type of content is likely to get upvoted (which is useful when submitting). You can view Ned’s Hacker News dreck filter at http://hn.njl.us/.
- How to write actually object-oriented Python by Per Fagrell. This session highlights proper use of procedural and structured paradigms. The SOLID and Tell, Don’t Ask principles are presented along with examples of proper use.
- Setting up your Python development environment in IPython by @kronosapiens. Run pip install ipython in the terminal for an exciting new REPL experience with some really cool features. Daniel highlights some useful IPython tips that will certainly improve our workflow, such as being able to access previous commands as though they were objects in a dictionary. You can also access docstrings, function definition prototypes, source code, source files, and other details of any object accessible to the interpreter with a single character (?).
- Python and Julia. Why do we need another language? by Dwight J. Browne. Dwight has a background in FORTRAN and C, having worked in the financial services sector since the nineties. He is excited about Julia as a way to get tremendous performance increases with the ease of Python. He makes the case that Python enthusiasts should explore it further.
- PyParallel by @trentnelson. The founder of this research project aims to bring the power of Windows I/O Completion Ports to Python for high performance and asynchronous support. Trent enlightens us on the fundamental problems *nix has with multi-threading in contrast to Windows, and how Python is designed around the idea of synchronous, non-blocking I/O.
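To give a feel for the asyncio session listed above, here is a minimal sketch of concurrent scraping using only the standard library. It is our own illustration, not Bugra's code. The names fetch, extract_title, and FAKE_PAGES are made up for the example, the "network" is simulated with asyncio.sleep so the snippet runs offline, and it uses the async/await syntax and asyncio.run from Python versions newer than the 3.4 release discussed in the talk.

```python
import asyncio

# Hypothetical stand-in for real web pages, so the sketch runs offline.
FAKE_PAGES = {
    "http://example.com/a": "<html><title>Page A</title></html>",
    "http://example.com/b": "<html><title>Page B</title></html>",
    "http://example.com/c": "<html><title>Page C</title></html>",
}


async def fetch(url):
    """Pretend to download a page; the sleep simulates network latency."""
    await asyncio.sleep(0.1)
    return FAKE_PAGES[url]


def extract_title(html):
    """Naive string-based title extraction, good enough for this sketch."""
    start = html.find("<title>") + len("<title>")
    end = html.find("</title>")
    return html[start:end]


async def scrape(urls):
    """Fetch all URLs concurrently and map each URL to its page title."""
    # gather() schedules every fetch at once and returns results in input order.
    pages = await asyncio.gather(*(fetch(url) for url in urls))
    return {url: extract_title(page) for url, page in zip(urls, pages)}


if __name__ == "__main__":
    titles = asyncio.run(scrape(list(FAKE_PAGES)))
    for url, title in titles.items():
        print(url, "->", title)
```

Because the three simulated downloads wait at the same time rather than one after another, the whole run takes roughly as long as the slowest single fetch. A real scraper would swap the body of fetch for an actual HTTP request.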
Thumbs up to Big Apple Py, the organizers, presenters, and sponsors for putting on a successful and informative conference. We look forward to the next one!