search_google

Richard Wen

A command line tool and module for Google API web and image search.

Tested for Python 2.6 and 3.5 using Anaconda 4.3.1.

search_google -h
search_google cats
search_google cats --searchType=image

Install

  1. Install Python
  2. Install search_google via pip
pip install search_google

Setup

  • A CSE ID and a Google API developer key are required to use this package
  • A Gmail account will also be required to create and access the ID and developer key
  • When asked to sign in, use your Gmail account for access

Note: Instructions and links were written on May 20, 2017, and are subject to change depending on Google’s website and API.

Google Custom Search Engine

A Google Custom Search Engine (CSE) and a CSE ID can be set up with the following instructions:

  1. Go to the CSE Control Panel
  2. Click Add
  3. Enter a website in the box under Sites to search such as “www.google.com”
  4. Click Create
  5. Go back to the CSE Control Panel
  6. Select your created search engine
  7. Turn on Image search
  8. For Sites to search, select Search the entire web but emphasize included sites
  9. Under Sites to search, click checkbox next to Site
  10. Under Sites to search, click Delete
  11. Under Details, click Search engine ID
  12. Set cx by replacing “your_cse_id” with the Search engine ID
search_google -s cx="your_cse_id"

Google API

An API developer key for the Google Application Programming Interface (API) can be set up with the following instructions:

  1. Enable Google Custom Search Engine
  2. Go to Google API Console Credentials
  3. Click Create Credentials -> API Key
  4. Set build_developerKey by replacing “your_dev_key” with the API Key
search_google -s build_developerKey="your_dev_key"

Usage

For help in the console, use:

search_google -h

Please ensure that the Setup section was completed:

search_google -s cx="your_cse_id"
search_google -s build_developerKey="your_dev_key"

Default Arguments

Default arguments persist even after the console is closed. They let you customize the search_google command without retyping a long list of arguments on every call.
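
One way persistent defaults like these could work is a small JSON file that is read before each call and merged underneath the arguments given on the command line. The sketch below is illustrative only: the file name defaults.json and the helpers load_defaults, set_default, and remove_default are assumptions made for this example, not the package's actual internals.

```python
import json
import os
import tempfile

def load_defaults(path):
    """Return the saved defaults, or an empty dict if none exist yet."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)

def set_default(path, key, value):
    """Save one default (similar in spirit to `search_google -s key=value`)."""
    defaults = load_defaults(path)
    defaults[key] = value
    with open(path, "w") as f:
        json.dump(defaults, f)

def remove_default(path, key):
    """Drop one default (similar in spirit to `search_google -r key`)."""
    defaults = load_defaults(path)
    defaults.pop(key, None)
    with open(path, "w") as f:
        json.dump(defaults, f)

# Hypothetical location for the defaults file
path = os.path.join(tempfile.mkdtemp(), "defaults.json")
set_default(path, "searchType", "image")
set_default(path, "option_preview", 20)
remove_default(path, "option_preview")

# Arguments given for a single call override the saved defaults
call_args = {"q": "cats", "fileType": "jpg"}
merged = dict(load_defaults(path), **call_args)
print(merged)
```

Because the defaults live in a file rather than in memory, they survive closing the console, which matches the behaviour described above.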

View the defaults:

search_google -v

Increase number of search results previewed to 20:

search_google -s option_preview=20

Turn off preview of search results:

search_google -s option_silent=True

Set the searchType argument to default to image search:

search_google -s searchType=image

Set the fileType argument to default to jpg images:

search_google -s fileType=jpg

Save the search result links to a text file named links.txt on every call:

search_google -s save_links=links.txt

Remove default arguments:

search_google -r option_preview
search_google -r option_silent
search_google -r searchType
search_google -r fileType
search_google -r save_links

Reset the defaults:

search_google -d

After resetting defaults, the developer key and CSE ID will have to be set again:

search_google -s cx="your_cse_id"
search_google -s build_developerKey="your_dev_key"

Additional Arguments

A number of optional arguments defined using -- are not shown when using search_google -h. These can be used with the same names as the arguments passed to Google’s CSE method:

search_google -a

For example, the index of the first result to return can be set with the start argument, which is a named argument in Google’s CSE method:

search_google cat --start=2
search_google cat --lr=lang_en
search_google cat --searchType=image --imgType=photo
search_google cat --searchType=image --imgDominantColor=brown
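
When moving from the command line to the module, each --name=value argument corresponds to a key in the cseargs dictionary. The helper below, flags_to_cseargs, is a hypothetical name used only for illustration. It is a minimal sketch of that mapping, not the parser used by the package, and it keeps values as strings, as they arrive from the command line.

```python
def flags_to_cseargs(query, flags):
    """Turn a query and a list of '--name=value' strings into a cseargs dict."""
    cseargs = {"q": query}
    for flag in flags:
        # Strip the leading dashes and split on the first '=' only
        name, _, value = flag.lstrip("-").partition("=")
        cseargs[name] = value
    return cseargs

# Equivalent of: search_google cat --searchType=image --imgType=photo
cseargs = flags_to_cseargs("cat", ["--searchType=image", "--imgType=photo"])
print(cseargs)
```

The resulting dictionary has the same shape as the cseargs argument accepted by search_google.api.results.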

Module Import

The search_google package may also be used as a Python module:

import search_google.api

# Define buildargs for cse api
buildargs = {
  'serviceName': 'customsearch',
  'version': 'v1',
  'developerKey': 'your_api_key'
}

# Define cseargs for search
cseargs = {
  'q': 'keyword query',
  'cx': 'your_cse_id',
  'num': 3
}

# Create a results object
results = search_google.api.results(buildargs, cseargs)

For more details on module usage, see the example in api.

Modules

api

class api.results(buildargs={'serviceName': 'customsearch', 'version': 'v1'}, cseargs={'num': 3, 'fileType': 'png'})[source]

Google Custom Search Engine (CSE) API results.

Uses the Google Custom Search Engine API to search webpages and images using queries.

Args:
buildargs (dict):
Named arguments for googleapiclient.build.
cseargs (dict):
Named arguments for cse.list.
Attributes:
metadata (dict):
object returned from cse.list.
buildargs (dict):
Same as argument buildargs for reference of inputs.
cseargs (dict):
Same as argument cseargs for reference of inputs.
Examples:
# Import the api module for the results class
import search_google.api

# Define buildargs for cse api
buildargs = {
  "serviceName": "customsearch",
  "version": "v1",
  "developerKey": "your_api_key"
}

# Define cseargs for search
cseargs = {
  "q": "keyword query",
  "cx": "your_cse_id",
  "num": 3
}

# Create a results object
results = search_google.api.results(buildargs, cseargs)

# Preview the search results
results.preview()

# Obtain the url links from the search
# Links are inside the results.metadata['items'] list
links = results.get_values('items', 'link')

# Alternatively, use the links attribute
links = results.links

# Save the search result metadata to a json file
results.save_metadata('metadata.json')

# Save the search result links to a text file
results.save_links('links.txt')

# Download the search results to a directory
results.download_links('downloads')

download_links(dir_path)[source]

Download web pages or images from search result links.

Args:
dir_path (str):
Path of directory to save downloads of api.results.links
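
A rough idea of what downloading a list of links into a directory involves is sketched below. This is not the package's implementation: the function signature, the file naming rule, and the fetch parameter are assumptions for illustration. The default fetcher is urllib.request.urlretrieve, which is available in Python 3 only, and a stand-in fetcher is used so the example runs without network access.

```python
import os
import tempfile
from urllib.request import urlretrieve  # Python 3 only

def download_links(links, dir_path, fetch=urlretrieve):
    """Save each link into dir_path and return the local file paths."""
    if not os.path.isdir(dir_path):
        os.makedirs(dir_path)
    saved = []
    for i, link in enumerate(links):
        # Use the last URL segment as the file name, with an index fallback
        name = os.path.basename(link.rstrip("/")) or "download_{}".format(i)
        target = os.path.join(dir_path, name)
        fetch(link, target)
        saved.append(target)
    return saved

def fake_fetch(url, target):
    """Stand-in for a real download: writes a placeholder file."""
    with open(target, "w") as f:
        f.write("contents of " + url)

dir_path = os.path.join(tempfile.mkdtemp(), "downloads")
saved = download_links(["http://example.com/cat.jpg"], dir_path, fetch=fake_fetch)
print([os.path.basename(p) for p in saved])
```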
get_values(k, v)[source]

Get a list of values for a key from the metadata attribute.

Args:
k (str):
Key in api.results.metadata
v (str):
Key of the value to get from each item in api.results.metadata[k]
Returns:
A list containing all the v values in the k key for the api.results.metadata attribute.
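
The behaviour described above can be pictured as a list comprehension over the items stored under key k. The standalone function and the metadata dictionary below are made up for illustration (the real method reads api.results.metadata, and real responses contain many more fields), so treat this as a sketch rather than the actual implementation.

```python
def get_values(metadata, k, v):
    """Collect the value under key v from every item in metadata[k]."""
    return [item[v] for item in metadata[k]]

# Made-up metadata shaped like a CSE response (not real search output)
metadata = {
    "items": [
        {"link": "http://www.google.ca", "snippet": "Search"},
        {"link": "http://www.gmail.com", "snippet": "Mail"},
    ]
}

links = get_values(metadata, "items", "link")
print(links)
```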

links

list of str: Web links to the search results, obtained using api.results.get_values().

preview(n=10, k='items', kheader='displayLink', klink='link', kdescription='snippet')[source]

Print a preview of the search results.

Args:
n (int):
Maximum number of search results to preview
k (str):
Key in api.results.metadata to preview
kheader (str):
Key in api.results.metadata[k] to use as the header
klink (str):
Key in api.results.metadata[k] to use as the link if image search
kdescription (str):
Key in api.results.metadata[k] to use as the description
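
To show how the preview arguments relate to the metadata, the sketch below builds the lines a preview might print for the first n items. The function name preview_lines, the output layout, and the sample metadata are assumptions for illustration; the package's actual output format may differ.

```python
def preview_lines(metadata, n=10, k="items", kheader="displayLink",
                  klink="link", kdescription="snippet"):
    """Build the text lines for a preview of the first n results."""
    lines = []
    for item in metadata.get(k, [])[:n]:
        lines.append(item.get(kheader, ""))
        lines.append(item.get(klink, ""))
        lines.append(item.get(kdescription, ""))
        lines.append("")  # blank line between results
    return lines

# Made-up metadata shaped like a CSE response (not real search output)
metadata = {
    "items": [
        {"displayLink": "www.google.ca", "link": "http://www.google.ca",
         "snippet": "Search"},
        {"displayLink": "www.gmail.com", "link": "http://www.gmail.com",
         "snippet": "Mail"},
    ]
}

for line in preview_lines(metadata, n=1):
    print(line)
```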

save_links(file_path)[source]

Saves a text file of the search result links.

Saves a text file of the search result links, where each link is saved in a new line. An example is provided below:

http://www.google.ca
http://www.gmail.com
Args:
file_path (str):
Path to the text file to save links to.
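
The one-link-per-line format described above can be produced and read back with a few lines of Python. The standalone function below is a sketch with an assumed signature (the real method takes only file_path and reads the links from the results object).

```python
import os
import tempfile

def save_links(links, file_path):
    """Write each link on its own line."""
    with open(file_path, "w") as f:
        for link in links:
            f.write(link + "\n")

links = ["http://www.google.ca", "http://www.gmail.com"]
file_path = os.path.join(tempfile.mkdtemp(), "links.txt")
save_links(links, file_path)

# Read the file back as a list of links
with open(file_path) as f:
    saved = f.read().splitlines()
print(saved)
```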
save_metadata(file_path)[source]

Saves a json file of the search result metadata.

Saves a json file of the search result metadata from api.results.metadata.

Args:
file_path (str):
Path to the json file to save metadata to.
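
A saved metadata file can later be reloaded with the standard json module, for example to extract links without repeating the search. The sketch below uses a standalone function with an assumed signature and made-up metadata; it is not the package's implementation.

```python
import json
import os
import tempfile

def save_metadata(metadata, file_path):
    """Write the metadata dictionary to a json file."""
    with open(file_path, "w") as f:
        json.dump(metadata, f)

# Made-up metadata shaped like a CSE response (not real search output)
metadata = {"items": [{"link": "http://www.google.ca"}]}
file_path = os.path.join(tempfile.mkdtemp(), "metadata.json")
save_metadata(metadata, file_path)

# Reload the file and pull the links back out
with open(file_path) as f:
    reloaded = json.load(f)
print([item["link"] for item in reloaded["items"]])
```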

cli

cli.run(argv=sys.argv)[source]

Runs the search_google command line tool.

This function runs the search_google command line tool in a terminal. It is intended to be called from a Python file (.py) that is executed with python.

Notes:
  • [q] reflects key q in the cseargs parameter for api.results
  • Optional arguments with build_ are keys in the buildargs parameter for api.results
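
The naming convention in these notes can be sketched as a split on the build_ prefix: prefixed names go to buildargs with the prefix removed, and everything else goes to cseargs. The function split_args is a hypothetical helper for illustration, not the package's code, and it ignores other argument families such as option_ and save_.

```python
def split_args(parsed):
    """Split parsed command line values into buildargs and cseargs."""
    prefix = "build_"
    buildargs, cseargs = {}, {}
    for name, value in parsed.items():
        if name.startswith(prefix):
            # build_developerKey becomes developerKey in buildargs
            buildargs[name[len(prefix):]] = value
        else:
            cseargs[name] = value
    return buildargs, cseargs

parsed = {
    "q": "google",
    "build_developerKey": "your_dev_key",
    "cx": "your_cse_id",
    "searchType": "image",
}
buildargs, cseargs = split_args(parsed)
print(buildargs)
print(cseargs)
```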

For distribution, this function must be defined in the following files:

# In 'search_google/search_google/__main__.py'
from .cli import run
run()

# In 'search_google/search_google.py'
from search_google.cli import run
if __name__ == '__main__':
  run()

# In 'search_google/__init__.py'
__entry_points__ = {'console_scripts': ['search_google=search_google.cli:run']}

Examples:

# Import search_google for the cli module
import search_google.cli

# Create command line arguments
argv = [
  'cli.py',
  'google',
  '--searchType=image',
  '--build_developerKey=your_dev_key',
  '--cx=your_cx_id',
  '--num=1'
]

# Run command line
search_google.cli.run(argv)