WOLFRAM COMMUNITY

GROUPS:

Wolfram|Alpha External Programs and Systems

One Way to Query Wolfram Alpha from Python

Shenghui Yang, WOLFRAM

Posted 11 years ago

Recently I found this link very useful for Python users who are interested in integrating W|A into their Python code. I will briefly explain what the code does and show some modifications I have made so that you can use Python interactively as a plain-text interface to get information from W|A.

Before you start, you can register for a developer account on the Wolfram Product page. You may also apply for a free, non-commercial-use API ID from the W|A developer portal. You will need this ID to access the database and its XML data structure.

Let me first show you how it looks when you run the modified script. The example is a "group of 8" query about an international group:

myMac: ~ shenghui$ ./WAPy.py 'group of 8'

The application then prints the input query and its URL-encoded form, which is just what you see in the URL bar after you search for this answer in the W|A web interface. It also prints a list of titles related to the detailed information. They correspond to the pods in the output shown in a web browser. For instance, the tabular result for Demographics is available both through the web query and through the API.

The prompt at the bottom asks for a valid choice from the title list. If you want to check out the demographics info for the member countries, just go ahead and type the exact title or paste it into the input prompt.

A nice feature added here is searching the pod info again within the current XML session. Because there is a cap on the non-commercial/personal developer API ID, I do not want to load the XML from the server again and waste my quota. Therefore I added this "try again" prompt so that I can reuse everything already on the local machine. Let's type y (yes) to get another piece of data from the XML.

Until you hit "n" or any letter other than 'y', you can squeeze the XML to the last drop of data.
The Python code looks like the following.

Preamble:

#!/usr/bin/python
from subprocess import call
call("clear")
import sys
import urllib2
import urllib
import httplib
from xml.etree import ElementTree as etree

The body is simply a method call on a wolfram object (in the file, it comes after the class definition shown next):

appid = 'UQ7*********'
query = sys.argv[1]
print 'I am asking for: ', query
w = wolfram(appid)
w.search(query)

All the heavy lifting goes into the definition of the object "wolfram". The constructor takes my API ID (appid) as input:

class wolfram(object):
    def __init__(self, appid):
        self.appid = appid
        self.base_url = 'http://api.wolframalpha.com/v2/query?'
        self.headers = {'User-Agent':None}

The _get_xml method uses the urlencode function to turn my W|A input, i.e. the first argument after WAPy.py on the command line, into its URL-encoded form. After the call, it gets the XML data from the W|A server:

    def _get_xml(self, ip):
        url_params = {'input':ip, 'appid':self.appid}
        data = urllib.urlencode(url_params)
        print data
        req = urllib2.Request(self.base_url, data, self.headers)
        xml = urllib2.urlopen(req).read()
        return xml

The core of this Python application is extracting the titles from the XML data:

    def _xmlparser(self, xml):
        data_dics = {}
        tree = etree.fromstring(xml)
        #retrieving every tag with label 'plaintext'
        for e in tree.findall('pod'):
            for item in [ef for ef in list(e) if ef.tag=='subpod']:
                for it in [i for i in list(item) if i.tag=='plaintext']:
                    if it.tag=='plaintext':
                        data_dics[e.get('title')] = it.text
        return data_dics

If you are not too familiar with the code, that's fine. Basically, it first creates a Python object <treeobj 0xpppppp> from the XML data and then looks for the proper tags. The following simply builds a list from the children of e that satisfy a condition:

[ef for ef in list(e) if ef.tag=='subpod']

Specifically, the condition is that the XML tag must be of the form <subpod> </subpod>. Finally, the human-readable data is stored in data_dics. This is a built-in Python data structure called a dictionary. Nothing fancy: it just stores a mapping from keys to values, a bit like a list that is indexed by names. For example,

a = {'name': 'wolfram research', 'product': "wolfram alpha"}

and a['name'] returns 'wolfram research'. Notice that you cannot use an integer index here for a.

The last part waits for my input and prints the result under the chosen pod title. This method calls the two methods of the wolfram object defined above to get "xml" and "result_dics":

    def search(self, ip):
        xml = self._get_xml(ip)
        result_dics = self._xmlparser(xml)
        print 'Available Titles', '\n'
        titles = dict.keys(result_dics)
        for ele in titles :
            print '\t' + ele
        print '\n'
        tryAgain = 'y'
        while tryAgain == 'y':
            s = raw_input('Choose Pod Title: (type quit to terminate) ')
            if s == 'quit':
                quit()
            while (s not in titles):
                if s == 'quit':
                    quit()
                print 'Not Valid Title'
                s = raw_input('Choose Pod Title Again: ')
            print result_dics[s]
            tryAgain = raw_input('\nTry other pod title(y/n): ')
        print '\nTerminate the query'

To print the pod titles once I have the dictionary object, I use the dict.keys method. In short, it does the following:

a = {'name': 'wolfram research', 'product': "wolfram alpha"}
dict.keys(a) ====> ['name', 'product']

The for loop goes through the list of pod titles and prints them out:

for ele in titles :
    print '\t' + ele

The next while loop keeps running until the user chooses to quit the program, and the inner loop checks whether the user has typed a valid pod title from the list of available pod titles.

Once you have the code saved to a .py file, use chmod a+x myfile.py to make it executable. Of course, this is just a very simple application of embedding W|A into Python code, yet it shows the basic collection of essential elements one needs to complete such a task. For a thorough manual about the XML structure, please go to this site.
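For readers on Python 3, where urllib2 and raw_input no longer exist, the two core steps described above (URL-encoding the query and collecting plaintext by pod title) might be sketched as follows. This is only a minimal sketch under stated assumptions, not the original script: the names build_query_url and parse_pods, the appid DEMO-APPID, and the sample XML string are all made up for illustration. The sample only mimics the pod/subpod/plaintext nesting that the parser relies on, so nothing is fetched from the server. Note also that the original script sends the encoded parameters as the request body, whereas this sketch simply appends them to the base URL to show what the encoding looks like.

```python
from urllib.parse import urlencode
from xml.etree import ElementTree as etree

BASE_URL = 'http://api.wolframalpha.com/v2/query?'  # same base URL as in the post


def build_query_url(appid, query):
    """Return the base URL followed by the URL-encoded parameters (hypothetical helper)."""
    return BASE_URL + urlencode({'input': query, 'appid': appid})


def parse_pods(xml):
    """Map each pod title to the plaintext of its last subpod, like _xmlparser."""
    data = {}
    tree = etree.fromstring(xml)
    for pod in tree.findall('pod'):
        for subpod in pod.findall('subpod'):
            for plaintext in subpod.findall('plaintext'):
                data[pod.get('title')] = plaintext.text
    return data


# Hand-written stand-in for a server response (an assumption, not real W|A output).
SAMPLE_XML = """
<queryresult>
  <pod title="Pod A">
    <subpod><plaintext>alpha</plaintext></subpod>
  </pod>
  <pod title="Pod B">
    <subpod><plaintext>beta</plaintext></subpod>
  </pod>
</queryresult>
"""

url = build_query_url('DEMO-APPID', 'group of 8')
print(url)

pods = parse_pods(SAMPLE_XML)
for title in pods:
    print('\t' + title)
```

The spaces in 'group of 8' come out as plus signs in the printed URL, which is the encoded form the script prints before sending the request.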

POSTED BY: Shenghui Yang

8 Replies

Nikhila B.S.

Posted 11 years ago

Thanks a lot! Since I am a beginner in Python, I did not know the use of self. I read up about it and I realized it was a separate method. Could you please tell me what the method does? (I'm very sorry, I'm very new to Python and using WA with Python) Thank you very much :)

POSTED BY: Nikhila B.S.

Shenghui Yang, WOLFRAM

Posted 11 years ago

You are very welcome. Once you stick with python more often, you will learn a lot about this language. Then you will find more ways to use the Wolfram Alpha API.

POSTED BY: Shenghui Yang

Shenghui Yang, WOLFRAM

Posted 11 years ago

You just need to update the _xmlparser method with the following:

def _xmlparser(self, xml):
    data_dics = {}
    tree = etree.fromstring(xml)
    #retrieving every tag with label 'plaintext'
    for e in tree.findall('pod'):
        for item in [ef for ef in list(e) if ef.tag=='subpod']:
            for it in [i for i in list(item) if i.tag=='plaintext']:
                if it.tag=='plaintext':
                    mykey = e.get('title')
                    if mykey not in data_dics.keys():
                        # first value seen for this pod title
                        data_dics[mykey] = [it.text]
                    else:
                        # append further subpod values to the existing list
                        prev = data_dics[mykey]
                        data_dics[mykey] = prev + [it.text]
    return data_dics

Thanks for the test query. There was a bug in the original code because a dict object in Python only keeps the latest value for a given key. Say

a["key1"] =3
a["key2"] =4
# a is {"key1":3 , "key2":4}

If you assign "key1" again with 6, you get

a["key1"] =6 
# a is updated to {"key1":6 , "key2":4}

For multiple values under a given key, we need to use a list as the value instead. After the update you should be able to get all the results.
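The same grouping can be written a little more compactly with dict.setdefault. The sketch below is one possible variant, written for Python 3 and not taken from the updated script: the function name collect_plaintext is hypothetical, and the XML string is hand-written to mimic a pod with two subpods, using the two solutions of the quadratic query from this thread.

```python
from xml.etree import ElementTree as etree


def collect_plaintext(xml):
    """Map each pod title to a list holding the plaintext of every subpod."""
    data = {}
    for pod in etree.fromstring(xml).findall('pod'):
        for subpod in pod.findall('subpod'):
            for plaintext in subpod.findall('plaintext'):
                # setdefault stores an empty list the first time a title is seen,
                # then returns the stored list so the new text can be appended
                data.setdefault(pod.get('title'), []).append(plaintext.text)
    return data


# Hand-written sample with two subpods under one pod (an assumption, not real W|A output).
SAMPLE_XML = """
<queryresult>
  <pod title="Solutions">
    <subpod><plaintext>x = -2</plaintext></subpod>
    <subpod><plaintext>x = -1</plaintext></subpod>
  </pod>
</queryresult>
"""

solutions = collect_plaintext(SAMPLE_XML)
print(solutions['Solutions'])
```

With a plain assignment, the second subpod would overwrite the first; with a list value, both entries survive under the 'Solutions' key.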

POSTED BY: Shenghui Yang

Nikhila B.S.

Posted 11 years ago

Thank you very much. But I'm getting this error : 'wolfram' object has no attribute '_WAUnicodeCvtr' I tried Googling the error (excluding the 'wolfram', of course), but I did not find anything. Is there anything extra I have to import for this? Or is there any step I have missed? Thanks in advance.

POSTED BY: Nikhila B.S.

Shenghui Yang, WOLFRAM

Posted 11 years ago

@Nikhila, I have updated the code. I defined `_WAUnicodeCvtr` myself for non-unicode input and I have removed the function in the last piece of code.

POSTED BY: Shenghui Yang

Nikhila B.S.

Posted 11 years ago

Thanks a lot! I had been looking for this for a long time. I just have one issue. If I have a quadratic equation in the query, the 'Solutions' key holds only one value of the unknown. How do I get BOTH values of the unknown?

Example query: Solve x^2+3*x+2=0 (actual solutions: x=-1 and x=-2)

But the value of the 'Solutions' key is only x=-1.

Please help me. Thanks in advance.

POSTED BY: Nikhila B.S.

Shenghui Yang, WOLFRAM

Posted 11 years ago

@Carrettie, I had no luck with the last three options in your list. They either print a timeout error or report that some package is not supported. I will go with an actual cloud platform such as AWS to run the command remotely. Also, there are several issues with the original code:

1. The sys.argv indices should move one step ahead (sys.argv[0] is the script name itself), after adding import sys
2. result_dics['Result'] does not work as a general option because not all queries return a "Result" title

Just FYI. Your pasted code should be left as is.
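The two issues can be illustrated with a short Python 3 sketch. It assumes the script is invoked as WAPy.py APPID QUERY; the helper names read_args and pick_result are hypothetical, and the small dictionaries stand in for the output of _xmlparser.

```python
import sys


def read_args(argv):
    """Return (appid, query); argv[0] is the script name, so start at index 1."""
    if len(argv) < 3:
        raise SystemExit('usage: %s APPID QUERY' % argv[0])
    return argv[1], argv[2]


def pick_result(result_dics, title='Result'):
    """Return the text under `title`, or None when the query has no such pod."""
    return result_dics.get(title)


# In a real script this would be read_args(sys.argv); a fixed list is used here
# so the example runs without command-line arguments.
appid, query = read_args(['WAPy.py', 'DEMO-APPID', 'group of 8'])
print(appid, query)

with_result = pick_result({'Result': '4'})
without_result = pick_result({'Input': 'x'})
print(with_result, without_result)
```

Using dict.get avoids the KeyError that result_dics['Result'] raises when a query has no "Result" pod; the caller can then fall back to listing the available titles.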

POSTED BY: Shenghui Yang

Sam Carrettie, Freelancer

Posted 11 years ago

This is great, thanks for sharing! I wonder if this will work with cloud / online Python, similar to these sites:

https://ideone.com
http://www.skulpt.org
http://repl.it/languages/Python
http://www.compileonline.com/execute_python_online.php

Just in case, I also give the full code from the site you link to, to keep the record, since sites come and go.

import urllib2
import urllib
import httplib
from xml.etree import ElementTree as etree

class wolfram(object):
    def __init__(self, appid):
        self.appid = appid
        self.base_url = 'http://api.wolframalpha.com/v2/query?'
        self.headers = {'User-Agent':None}

    def _get_xml(self, ip):
        url_params = {'input':ip, 'appid':self.appid}
        data = urllib.urlencode(url_params)
        req = urllib2.Request(self.base_url, data, self.headers)
        xml = urllib2.urlopen(req).read()
        return xml

    def _xmlparser(self, xml):
        data_dics = {}
        tree = etree.fromstring(xml)
        #retrieving every tag with label 'plaintext'
        for e in tree.findall('pod'):
            for item in [ef for ef in list(e) if ef.tag=='subpod']:
                for it in [i for i in list(item) if i.tag=='plaintext']:
                    if it.tag=='plaintext':
                        data_dics[e.get('title')] = it.text
        return data_dics

    def search(self, ip):
        xml = self._get_xml(ip)
        result_dics = self._xmlparser(xml)
        #return result_dics
        #print result_dics
        print result_dics['Result']

if __name__ == "__main__":
    appid = sys.argv[0]
    query = sys.argv[1]
    w = wolfram(appid)
    w.search(query)

POSTED BY: Sam Carrettie
