What is your environment - tool set & libraries to perform data analysis in Python?



I am using Python for data analysis and wanted to check what environments other people are using. Here is a list of the tools I use:

  • Komodo Edit for creating and editing code
  • Python shell for interactive programming
  • Virtualenv for creating separate, isolated environments

Following are the libraries I have used:

  • NumPy
  • SciPy
  • Matplotlib
  • Pandas
  • Scikit-learn

What is your environment like? Any suggestions?


Here are a few additional tools I would suggest:


  • Personally, I use Sublime Text 3 for editing code and managing projects. I find it handier
  • IPython shell and Notebook - a must-have in every data scientist's toolkit, and the default choice for exploratory data analysis


  • In addition to virtualenv, I would recommend virtualenvwrapper
  • I have also used Vagrant on a Windows setup and would recommend it


  • statsmodels
  • Theano for neural networks
  • Seaborn - a library based on matplotlib for creating awesome visualizations
  • Pydoop for accessing Hadoop through Python
  • re for regular expressions - for data cleaning
  • BeautifulSoup or Scrapy for web scraping
  • json for reading JSON files
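
As a rough illustration of the last few items, here is a minimal sketch that uses only the standard-library re and json modules to load and clean a couple of records. The sample data and the clean_record helper are made up for this example, so treat it as a starting point rather than a recipe.

```python
import json
import re

# Hypothetical raw records, as they might arrive from a scraped page or an API dump.
raw = (
    '[{"name": "  Alice ", "phone": "(555) 123-4567"},'
    ' {"name": "Bob", "phone": "555.987.6543"}]'
)


def clean_record(record):
    """Trim whitespace from the name and keep only the digits of the phone number."""
    return {
        "name": record["name"].strip(),
        # \D matches any non-digit character; replacing it with "" leaves digits only.
        "phone": re.sub(r"\D", "", record["phone"]),
    }


# json.loads parses the JSON text into a list of dicts, which we then clean one by one.
records = [clean_record(r) for r in json.loads(raw)]
print(records)
```

From here, the cleaned list of dicts can be handed to Pandas for the actual analysis.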


Anaconda. It has almost everything both you and Kunal have listed, and much more including the kitchen sink. If you don’t want to install everything in one go, you can opt for the much smaller Miniconda and customise it to your needs. Although these days, with all the courses in which I have enrolled, I tend to use R a lot more.