Install Python 3 coexisting with Python 2 in Linux

In this post I’m going to explain how to install Python 3 alongside Python 2 on Linux (Kubuntu 13.10 in my case). The first step is to install some dependencies.

After that, download the latest version of Python 3 from the website and uncompress it into a folder of your choice.

Inside the uncompressed folder, we’re going to compile and install Python 3 with the following commands:

To check that the installation is correct, run /opt/python3.3/bin/python3; you should get something like this:

You should also have brand new commands such as python3 and python3.3. To manage Python 3 modules we’re going to install pip, but the Python 3 version :-). Happily there is a Debian package for that; install it with sudo apt-get install python3-pip.
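As a quick sanity check on the commands mentioned above, a small script can report which of them are visible on the PATH and which interpreter is running it. This is a minimal sketch: the helper name find_commands is my own, and pip3 is assumed to be the command that the python3-pip package provides.

```python
import shutil
import sys


def find_commands(names=("python3", "python3.3", "pip3")):
    """Map each command name to its full path, or None if it is not on the PATH."""
    return {name: shutil.which(name) for name in names}


# Report the interpreter running this script, then each command's location.
print("Running under Python", sys.version.split()[0])
for name, path in find_commands().items():
    print("{:<10} {}".format(name, path or "not found"))
```

Run it with the interpreter you want to inspect, for example /opt/python3.3/bin/python3 followed by the script name.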

Finally we’re going to create a Python 3 virtual environment. There are two ways to do that. The first one, the hard way, is specifying the path of the Python interpreter that you want to use.
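The first approach can be sketched in Python as follows. It is only an illustration under assumptions: it presumes the virtualenv command is already on the PATH, reuses the interpreter path /opt/python3.3/bin/python3 from earlier in this post, and picks venv as the directory name. The helper names are made up for the example; the -p option is virtualenv’s way of selecting the interpreter.

```python
import subprocess

# Assumptions for illustration: interpreter path from this post, "venv" as target directory.
PYTHON3_PATH = "/opt/python3.3/bin/python3"
ENV_DIR = "venv"


def virtualenv_command(python_path=PYTHON3_PATH, env_dir=ENV_DIR):
    """Build the argument list for `virtualenv -p <python> <dir>` (hypothetical helper)."""
    return ["virtualenv", "-p", python_path, env_dir]


def create_environment(python_path=PYTHON3_PATH, env_dir=ENV_DIR):
    """Run virtualenv; raises CalledProcessError if the command fails."""
    subprocess.check_call(virtualenv_command(python_path, env_dir))


# Show the command that would be run, without executing it.
print(" ".join(virtualenv_command()))
```

Calling create_environment() actually runs the command; the environment is then activated from the shell with source venv/bin/activate.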

The second one requires installing the virtualenv module first with pip3: sudo pip3 install virtualenv. Once it is installed, use virtualenv-2.7 to create a Python 2 virtual environment and virtualenv-3.3 to create a Python 3 one. Enter the following command to get your Python 3 virtual environment.

Once you’ve created and activated your Python 3 virtual environment, you’ll be able to install new modules using the pip command (inside our Python 3 virtual environment the pip command is the same as pip-3.3). Let’s test it by installing the django module with pip install django. After that, run the Python 3 console and type the following instructions:
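A minimal check along those lines, assuming Django was installed into the active environment with the pip command above, could look like this. The function name django_version is my own; django.get_version() is the call Django provides for reporting its version.

```python
import sys


def django_version():
    """Return the installed Django version string, or None if Django is missing."""
    try:
        import django
    except ImportError:
        return None
    return django.get_version()


# Print the interpreter version next to the Django version for comparison.
print("Python:", sys.version.split()[0])
print("Django:", django_version() or "not installed")
```

If the Python line shows a 2.x version, the console was started outside the Python 3 virtual environment.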

I hope it helps you!

Scraping a website using Python, Selenium, lxml and PhantomJS

In this post I’m going to show a basic example of scraping a website using Python with the headless browser PhantomJS. In other words, I’m going to automate the process of extracting information from a website using a browser that doesn’t have or need a user interface.

  • The easiest way to work with Python is using virtual environments with virtualenv. In Linux (Debian in my case) enter the following commands to install it.

    Now go to a directory of your choice, then create and activate the new virtual environment with the following commands.
  • First test. We need a couple of dependencies for the scraping: selenium and lxml. To install them, type pip install selenium and pip install lxml inside our virtual environment. If you have problems installing lxml, it is because some dependencies are missing. In that case, delete the virtual environment that you have just created.

    After that, install the following dependencies; they are necessary to compile the lxml module.

    Once you’ve installed them, create the virtual environment again as we did before, and install lxml inside it. If everything goes well you should see something like Successfully installed lxml. Now we are going to test the following code:

    Save it into a file named “test1.py” and run it inside the virtual environment venv with python test1.py; you should get the following output:
  • Second test. First we are going to install PhantomJS following the instructions at phantomjs.org; we create a folder for it and execute

    Now we’re going to modify the source code of the previous example by changing the instantiation of the browser from Firefox() to PhantomJS(). I’ve also specified a size for the browser window. If the website has a responsive design, you may be interested only in the data shown at one resolution.

    Save the source code into a file “test2.py” and run it inside the virtual environment with python test2.py; you should get the same output, but without a browser window opening.
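The two tests above can be sketched as a single script. Treat it as an illustration under assumptions rather than the listings from this post: the URL, the window size and the choice of extracting links are placeholders I picked, and the function names are hypothetical. Note also that webdriver.PhantomJS belongs to the Selenium 2 and 3 releases this post was written for; it is not available in Selenium 4.

```python
# Placeholder values for illustration only; the site scraped in the post is not known.
URL = "http://example.com/"
WINDOW_SIZE = (1024, 768)


def fetch_page_source(url, headless=True):
    """Load `url` in a Selenium-driven browser and return the rendered HTML."""
    # Imported here so the parsing function below works without Selenium installed.
    from selenium import webdriver

    # PhantomJS() runs without a window (second test); Firefox() opens one (first test).
    browser = webdriver.PhantomJS() if headless else webdriver.Firefox()
    try:
        # Fix the window size so a responsive site serves one known layout.
        browser.set_window_size(*WINDOW_SIZE)
        browser.get(url)
        return browser.page_source
    finally:
        browser.quit()


def extract_links(page_source):
    """Return (text, href) pairs for every link found in the HTML string."""
    # Imported here so the module loads even before lxml has been compiled.
    from lxml import html

    tree = html.fromstring(page_source)
    return [(a.text_content().strip(), a.get("href")) for a in tree.xpath("//a[@href]")]
```

Calling extract_links(fetch_page_source(URL)) corresponds to the second test; passing headless=False opens Firefox instead, which matches the first test. Keeping the parsing in its own function means the XPath expression can be tried on a saved HTML string without starting a browser.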

 
