Jun 30, 2022
The basic packages for data processing and for mathematical and scientific calculations.
Package | Description |
---|---|
pandas | data analysis and manipulation. read/write excel/csv/compressed files, process data, and plot ugly graphs |
numpy | mathematical functions, random number generators, linear algebra routines. |
scipy | optimization, integration, interpolation, eigenvalue problems, algebraic equations, differential equations, statistics |
Package | Description |
---|---|
statsmodels | statistical models and hypothesis tests |
scikit-learn | this is the must-have package for any machine learning project |
xgboost | gradient boosting trees. For a long period of time, it was the Kaggle competition participants' favorite. |
lightgbm | also gradient boosting trees. it took xgboost's place and became the favorite of Kaggle competition participants. |
pytorch | I prefer it over keras and tensorflow. The cornerstone for deep learning models. |
these packages help create decent-looking graphs
Package | Description |
---|---|
plotly | easy to use and can create some nice graphs |
seaborn | often works hand-in-hand with matplotlib. it takes some skill to create presentable graphs. |
matplotlib | easy to create simple but ugly graphs with it, but it takes real skill to create graphs that look presentable. |
Package | Description |
---|---|
Scrapy | for web scraping |
beautifulsoup4 | extract data from html pages |
Package | Description |
---|---|
pytorch-tabnet | based on pytorch. it is becoming very popular on Kaggle |
pytorch-forecasting | it has the potential to be practitioners' new favorite. it has some really cool deep learning algorithms. |
pytorch-lightning | if you install pytorch-forecasting, this package along with pytorch will also be installed |
either one of the following is sufficient to do the job
Package | Description |
---|---|
hyperopt | I use it for all my projects. Very powerful package for hyperparameter tuning |
optuna | an alternative to hyperopt. I don't use it, but it is used by the pytorch-forecasting package and gets installed along with it |
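
For readers new to hyperparameter tuning, here is a minimal sketch of what a tuner does: it tries parameter sets from a search space and keeps the one with the lowest loss. This uses plain random search from the standard library, not the hyperopt or optuna API. The names `random_search`, `objective`, and `space` are made up for illustration.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample parameters uniformly and keep the set with the lowest loss.

    `space` maps a parameter name to a (low, high) range. This is plain
    random search, an assumption for illustration, not the sampling
    strategy of any particular tuning library.
    """
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {
            name: rng.uniform(low, high) for name, (low, high) in space.items()
        }
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


if __name__ == "__main__":
    # Toy objective with its minimum at x = 2.
    best, loss = random_search(lambda p: (p["x"] - 2) ** 2, {"x": (-5, 5)})
    print(best, loss)
```

In a real project the objective would train a model and return a validation loss, and a dedicated tuning package would choose the next trial more cleverly than uniform sampling does.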
Package | Description |
---|---|
ta-lib | This is a popular package for engineering technical features for time series data. Note that this package requires Visual Studio Community 2015 be installed on the machine first |
finta | This package is a nice alternative if installing ta-lib is not possible due to various reasons |
tsfresh | tsfresh is used for systematic feature engineering from time-series and other sequential data |
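
To make "technical features" concrete, here is a rough sketch of one of the simplest ones, a simple moving average, written with the standard library only. It does not use the ta-lib, finta, or tsfresh APIs. The function name and the `None` padding for incomplete windows are my own choices for illustration.

```python
from collections import deque


def simple_moving_average(values, window):
    """Return the rolling mean; positions without a full window are None."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]  # this value is evicted by the append below
        buf.append(v)
        total += v
        out.append(total / window if len(buf) == window else None)
    return out


if __name__ == "__main__":
    closes = [1, 2, 3, 4, 5]
    print(simple_moving_average(closes, 3))
```

The packages in the table compute many such indicators for you, so a hand-rolled version like this is mostly useful for understanding what the feature means.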
Package | Description |
---|---|
Merlion | This is a decent package for anomaly detection. It is developed by Salesforce. |
Package | Description |
---|---|
yfinance | This is a very neat package that helps download stock price data from Yahoo Finance. |
PyArrow | I use this package to write Parquet files. |
joblib==1.1.0
numpy==1.21.4
scipy==1.7.3
pandas==1.3.4
scikit-learn==1.0.1
lightgbm==3.3.1
xgboost==1.5.1
tsfresh==0.17.0
pytorch-forecasting==0.9.2
pytorch-lightning==1.5.6
hyperopt==0.1.2
mysql-connector==2.2.9
openpyxl==3.0.7
XlsxWriter==3.0.1
xlrd==2.0.1
seaborn==0.11.2
statsmodels==0.13.1
beautifulsoup4==4.10.0
Scrapy==2.5.1
plotly==5.3.1
matplotlib==3.5.0
pytorch-tabnet==3.1.1
optuna==2.10.0
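
After installing from a pinned list like the one above, it can be handy to confirm that the environment really matches the pins. The sketch below does that with `importlib.metadata` from the standard library (Python 3.8+). The helper names `parse_pins` and `check_pins` are hypothetical, and the sample string stands in for whatever file you keep the pins in.

```python
from importlib import metadata


def parse_pins(text):
    """Parse 'name==version' lines into a {name: version} dict."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def check_pins(pins, get_version=metadata.version):
    """Return {name: (pinned, installed_or_None)} for every mismatch."""
    problems = {}
    for name, pinned in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            problems[name] = (pinned, installed)
    return problems


if __name__ == "__main__":
    sample = "numpy==1.21.4\npandas==1.3.4\n"
    for name, (pinned, installed) in check_pins(parse_pins(sample)).items():
        print(f"{name}: pinned {pinned}, installed {installed or 'not installed'}")
```

An empty result means every pinned package is installed at exactly the pinned version.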
or, if you prefer to run pip one by one, try the following.
pip install pandas==1.3.4
pip install scipy==1.7.3
pip install numpy==1.21.4
pip install scikit-learn==1.0.1
python -m pip install --upgrade pip
pip install lightgbm==3.3.1
pip install xgboost==1.5.1
pip install hyperopt==0.1.2
pip install seaborn==0.11.2
pip install matplotlib==3.5.0
pip install plotly==5.3.1
pip install pytorch-tabnet==3.1.1
pip install pytorch-forecasting==0.9.2
pip install pytorch-lightning==1.5.6
pip install openpyxl==3.0.7
pip install XlsxWriter==3.0.1
pip install xlrd==2.0.1
pip install mysql-connector==2.2.9
pip install beautifulsoup4==4.10.0
pip install Scrapy==2.5.1
pip install notebook==6.4.6
pip3 install torch torchvision torchaudio
pip install tsfresh==0.17.0
pip install statsmodels==0.13.1
pip install yfinance==0.1.66
pip install pyarrow==6.0.1
pip install TA-Lib==0.4.23
pip install h5py==3.7.0
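
If typing the commands one at a time gets tedious, the same one-by-one approach can be scripted. This is a minimal sketch, assuming you want pip to run under the same interpreter that runs the script (hence `sys.executable -m pip`) and to carry on after a failure. `pip_command` and `install_one_by_one` are hypothetical helper names.

```python
import subprocess
import sys


def pip_command(requirement):
    """Build a pip command that targets the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", requirement]


def install_one_by_one(requirements, run=subprocess.run):
    """Install each requirement separately; return the ones that failed."""
    failed = []
    for req in requirements:
        result = run(pip_command(req))
        if result.returncode != 0:
            failed.append(req)  # keep going so one failure does not block the rest
    return failed
```

For example, `install_one_by_one(["pandas==1.3.4", "scipy==1.7.3"])` runs two separate installs and returns a list of whichever pins pip could not install.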
A few more notes
https://files.pythonhosted.org/packages/94/e5/2a808d611a5d44e3c997c0d07362c04a56c70002208e00aec9eee3d923b5/pytorch_tabnet-3.1.1-py3-none-any.whl
- A DLL load failure may occur if the Microsoft Visual C++ Redistributable is not installed.
- It can be downloaded at https://aka.ms/vs/16/release/vc_redist.x64.exe
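
A quick way to spot the DLL problem mentioned above is to import each package and look at the error text. The sketch below is a rough check, assuming the failure shows up as an `ImportError` or `OSError` whose message contains "DLL load failed", which is the usual wording on Windows. `try_import` is a hypothetical helper, and the module names in the example are only guesses at which packages might be affected.

```python
import importlib


def try_import(module_name):
    """Import a module and return (ok, message) instead of raising."""
    try:
        importlib.import_module(module_name)
    except (ImportError, OSError) as exc:
        text = str(exc)
        if "DLL load failed" in text:
            return False, f"{module_name}: DLL load failure ({text})"
        return False, f"{module_name}: import failed ({text})"
    return True, f"{module_name}: OK"


if __name__ == "__main__":
    # Module names here are illustrative; swap in whatever you installed.
    for name in ["torch", "lightgbm"]:
        print(try_import(name)[1])
```

A "DLL load failure" line in the output is the cue to install the redistributable and try again.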