Python: Installing Beautiful Soup on Mac OS

I installed it by following the instructions published on the distribution site.

References

  1. Beautiful Soup: We called him Tortoise because he taught us.
  2. Beautiful Soup Documentation — Beautiful Soup 4.0.0 documentation

The current release is Beautiful Soup 4.1.3 (August 20, 2012). You can install it with pip install beautifulsoup4 or easy_install beautifulsoup4. It’s also available as the python-beautifulsoup4 package in recent versions of Debian and Ubuntu.

Installation

$ sudo easy_install beautifulsoup4
Password:
Searching for beautifulsoup4
Reading https://pypi.org/simple/beautifulsoup4/
Reading http://www.crummy.com/software/BeautifulSoup/bs4/
Reading http://www.crummy.com/software/BeautifulSoup/bs4/download/
Best match: beautifulsoup4 4.1.3
Downloading 
Processing beautifulsoup4-4.1.3.tar.gz
Running beautifulsoup4-4.1.3/setup.py -q bdist_egg --dist-dir /tmp/easy_install-vQ3gxL/beautifulsoup4-4.1.3/egg-dist-tmp-QEs1O5
zip_safe flag not set; analyzing archive contents...
Adding beautifulsoup4 4.1.3 to easy-install.pth file
Installed /Library/Python/2.7/site-packages/beautifulsoup4-4.1.3-py2.7.egg
Processing dependencies for beautifulsoup4
Finished processing dependencies for beautifulsoup4
$
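
To confirm that the install actually worked, one option is to check whether the module can be imported. The following is a minimal sketch using only the standard library; the helper name is_installed is my own and is not part of Beautiful Soup.

```python
import importlib


def is_installed(module_name):
    """Return True if module_name can be imported, False otherwise.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Beautiful Soup 4 is imported under the name "bs4".
print(is_installed('bs4'))
```

If this prints False right after a successful install, the interpreter being run is probably not the one that easy_install installed into.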

With this installation method, the package is installed into the directory below, and that directory is also added to the module search path list (sys.path).

/Library/Python/2.7/site-packages/beautifulsoup4-4.1.3-py2.7.egg

>>> import sys
>>> print sys.path
['', '/Library/Python/2.7/site-packages/beautifulsoup4-4.1.3-py2.7.egg',
'/Library/Frameworks/Python.framework/Versions/2.7/lib/python27.zip',
'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7',
'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-darwin',
'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac',
'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/plat-mac/lib-scriptpackages',
'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-tk',
'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-old',
'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/lib-dynload',
'/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages',
'/Library/Python/2.7/site-packages']
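
Rather than reading through the whole list by eye, the matching entries can be filtered out. This is a small sketch assuming that searching for the substring 'beautifulsoup4' is good enough; find_path_entries is a name I made up for illustration.

```python
import sys


def find_path_entries(keyword, paths=None):
    """Return the entries of paths (default: sys.path) that contain keyword.

    Hypothetical helper for illustration only.
    """
    if paths is None:
        paths = sys.path
    return [p for p in paths if keyword in p]


# Prints one line per matching entry; prints nothing if there is no match.
for entry in find_path_entries('beautifulsoup4'):
    print(entry)
```

Note that a separate entry like the one above is specific to egg installs made by easy_install. With a pip install the package normally sits inside site-packages itself, so this search would likely come back empty even though the import works.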

Usage

>>> from urllib import urlopen
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(urlopen('http://www.yukun.info'))
>>> print soup.title
<title>Yukun's Blog</title>
>>> print soup.title.name
title
>>> print soup.title.string
Yukun's Blog
>>> for link in soup.find_all('a'):
...     print(link.get('href'))
...
https://yukun.info/
#content
https://yukun.info/
http://www.yukun.info/about

...
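
The href values printed above are a mix of absolute URLs and a bare fragment (#content). If absolute URLs are wanted, one possible approach is to resolve each value against the page URL with the standard library's urljoin. The sketch below makes a few assumptions: absolutize_links is a hypothetical helper, the sample list is copied from the output above rather than fetched live, and the base URL is the one used above written with a trailing slash. It also skips None, which link.get('href') returns for an a tag without an href attribute.

```python
try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin      # Python 2


def absolutize_links(hrefs, base_url):
    """Resolve each href against base_url, skipping missing (None) values.

    Hypothetical helper for illustration only.
    """
    return [urljoin(base_url, href) for href in hrefs if href is not None]


# Sample values copied from the session output above, plus a None to
# stand in for an <a> tag that has no href attribute.
hrefs = ['https://yukun.info/', '#content', 'http://www.yukun.info/about', None]

for link in absolutize_links(hrefs, 'http://www.yukun.info/'):
    print(link)
```

In a real session the list would come from something like [link.get('href') for link in soup.find_all('a')] instead of being typed in by hand.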