needleinthehay.de

Useful Python Packages

Code Quality

Tach

Enforce module dependency constraints.

PyDeps

Visualize module dependency structure.

Code2Flow

Visualize call graphs and get an approximate picture of the project structure. (Note: It sometimes makes errors when different functions share the same name.)

Data Wrangling

Bonobo

Data Processing and ETL with Python.

Modin

Parallel Pandas for data that does not fit in memory. Built with Pandas on Ray. (Not for Windows.)

numba

Speed up numerical calculations by JIT-compiling Python functions. (not yet tested)

Machine Learning

General

Statsmodels

Similar to scikit-learn, but closer to R syntax. I like the reports and presentation of results, which are really useful out of the box.
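To show what the R-like syntax and the built-in report look like, here is a small sketch using the formula interface. It assumes statsmodels, pandas and NumPy are installed; the synthetic data and the column names `x` and `y` are invented for this example.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data (an assumption for illustration): y = 2 + 3x + small noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=200)
df = pd.DataFrame({"x": x, "y": y})

# R-style formula: regress y on x (an intercept is added automatically).
model = smf.ols("y ~ x", data=df).fit()

# The summary is the report with coefficients, standard errors, R-squared etc.
print(model.summary())
```

The fitted coefficients are available as `model.params`, indexed by `Intercept` and `x`.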

CLiPS Pattern

Module for web crawling & NLP.

Time series

pyts

Time series transformations & classification. (not yet tested)

tsfresh

Automated feature extraction from time-series (not yet tested)

Rocket

Fast time series classification (not yet tested)

Darts

Forecasting and anomaly detection on time series (not yet tested)

AutoML & optimization

tpot

scikit-learn based AutoML library. (not yet tested)

Talos

Hyperparameter scanning and optimization for Keras.

Visualization

d-tale

Interactive viewer for exploring and analyzing Pandas data structures. (not yet tested)

PandasGui

Drag & drop visualizations based on Plotly. A lighter option if D-Tale is overkill. (not yet tested)

manifold

Visual ML model debugging. Helps detect which subsets of data a model predicts inaccurately and explains potential causes of poor model performance. (not yet tested)

dtreeviz

Decision tree visualization and model interpretation.

Charting

Pyxley

Create interactive dashboards and charts. Similar to Shiny for R.

BqPlot

Interactive charting in Jupyter Notebook.

Bokeh

Visualization in IPython Notebooks.

Altair

Visualization in IPython Notebooks and more. Uses Vega.

Web

Hug

Framework for writing REST APIs; an alternative to the more prominent Flask. (not yet tested)

Splash

Headless browser for scraping dynamic websites that rely on JavaScript.

Scrapy

Scrape data from webpages. Take a look at this short intro or this more elaborate tutorial.

Helium

Simpler high-level API on top of the browser automation module Selenium. (not yet tested)

FastAPI

Alternative to Flask for building APIs, based on Python type hints.

Quart

Async reimplementation of Flask. Supports websockets. (not yet tested)

CLI

Fire

Easy command line interaction. Turns functions and classes into CLIs and exposes their parameters as command line arguments.

Little helpers

pyvmmonitor

Profile Python programs to identify performance problems and issues in resource usage.

tqdm

Progress bar for CLI and Jupyter Notebooks.
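Typical usage is just wrapping an iterable. The sketch below assumes tqdm is installed; the helper `total_with_progress` is a made-up name for illustration.

```python
from tqdm import tqdm


def total_with_progress(items):
    # Wrapping the iterable in tqdm() prints a live progress bar to stderr.
    total = 0
    for item in tqdm(items, desc="Summing"):
        total += item
    return total


result = total_with_progress(range(1000))
print(result)
```

In a notebook, importing from `tqdm.auto` instead picks a notebook-friendly bar where available.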

snoop

Powerful debugging tool. (not yet tested)

Jupyter

TensorWatch

Visualize results during TensorFlow training. (not yet tested)