Simple Custom Search Engine

Posted 5 years ago

I wanted to play around with the DuckDuckGo Instant Answer API and ended up creating a cloud form that uses the API on the back end to perform searches. Where the form sends you depends on which of three cases the query falls into. This post walks through the process.

The API

The API isn't a full results API. Instead, it makes use of the "Instant Answer" service: it collates top results from various sources and provides the answer at the top of the results page, before the regular results appear. This means less time spent sifting through results.

The main things we care about are the return fields, all of which are listed on the doc page: https://duckduckgo.com/api

The API is accessed directly:

"https://api.duckduckgo.com/?q=" <> q <> "&format=json"

where q is the term you want to search for.
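Plain string concatenation works for a single word, but characters such as spaces or `&` need to be encoded before they go into a URL. As a minimal sketch (the helper name `ddgURL` is hypothetical and not part of the API), the built-in URLBuild can assemble the query string and take care of the encoding:

```wolfram
(* ddgURL is a hypothetical helper name used only for illustration *)
ddgURL[q_String] := URLBuild[
  "https://api.duckduckgo.com/",
  {"q" -> q, "format" -> "json"} (* URLBuild encodes the parameter values *)
  ]

ddgURL["milk"]
ddgURL["C++ & Java"]
```

The second call is the kind of query that breaks a hand-built URL, since an unencoded `&` would be read as a parameter separator.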

Let's import a test query and look at the available return fields:

rawData = Import["https://api.duckduckgo.com/?q=milk&format=json", "RawJSON"];
rawData // Keys
(*{"ImageHeight", "Entity", "AnswerType", "AbstractText", "Definition", \
"DefinitionSource", "Answer", "DefinitionURL", "meta", "ImageIsLogo", \
"Image", "Infobox", "Results", "ImageWidth", "Heading", \
"AbstractURL", "Redirect", "Type", "Abstract", "AbstractSource", \
"RelatedTopics"}*)
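Many of these fields can come back as empty strings for a given query. A short sketch, repeating the import above and using an arbitrary selection of fields, pulls out just the ones that summarize the answer:

```wolfram
rawData = Import["https://api.duckduckgo.com/?q=milk&format=json", "RawJSON"];

(* Keep a few summary fields; this particular selection is only an example *)
summary = KeyTake[rawData, {"Heading", "Type", "AbstractText", "AbstractURL"}];

(* Drop the fields that the API returned as empty strings *)
Select[summary, # =!= "" &]
```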

Let's look at any direct hits:

rawData["Results"]
(*{}*)

Any related topics available?

rawData["RelatedTopics"] // Keys
(*{{"FirstURL", "Result", "Text", "Icon"}, {"Result", "FirstURL", 
  "Icon", "Text"}, {"Icon", "Text", "Result", "FirstURL"}, {"Topics", 
  "Name"}, {"Topics", "Name"}, {"Topics", "Name"}, {"Topics", 
  "Name"}, {"Topics", "Name"}}*)
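The output shows two shapes of entry: plain topics that carry a `FirstURL`, and named groups that carry their own `Topics` list. One possible way to treat both uniformly is sketched below. It assumes each group's `Topics` list holds entries of the plain shape, and `flattenTopics` is a hypothetical helper name:

```wolfram
(* flattenTopics is a hypothetical helper; it assumes each group's "Topics"
   value is a list of plain topic associations *)
flattenTopics[topics_List] :=
 Catenate[If[KeyExistsQ[#, "Topics"], #["Topics"], {#}] & /@ topics]

(* Small hand-written sample mirroring the two shapes shown above *)
sample = {
   <|"FirstURL" -> "https://duckduckgo.com/Milk", "Text" -> "Milk"|>,
   <|"Name" -> "Group",
    "Topics" -> {<|"FirstURL" -> "https://example.com/a", "Text" -> "A"|>}|>
   };

Lookup[flattenTopics[sample], "FirstURL"]
(* {"https://duckduckgo.com/Milk", "https://example.com/a"} *)
```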

Let's look at the first element:

First[rawData["RelatedTopics"]]
(*<|"FirstURL" -> "https://duckduckgo.com/Milk", 
 "Result" -> 
  "<a href=\"https://duckduckgo.com/Milk\">Milk</a> A white liquid \
produced by the mammary glands of mammals. It is the primary source \
of...", "Text" -> 
  "Milk A white liquid produced by the mammary glands of mammals. It \
is the primary source of...", 
 "Icon" -> <|"URL" -> "https://duckduckgo.com/i/2abefe39.jpg", 
   "Height" -> "", "Width" -> ""|>|>*)

Extract the FirstURL:

First[rawData["RelatedTopics"]]["FirstURL"]
(*"https://duckduckgo.com/Milk"*)

Constructing the Query Function

The example above illustrates a query without a direct hit: `Results` is empty, so the most pertinent URL is that of the first related topic, which leads to a results page produced by DuckDuckGo. This is one of three cases I've identified (no metainformation at all, no direct hit, and a direct hit), and each can be handled with the conditional Which:

Which[
 rawData["meta"] === Null, "The search cannot be completed.", (*first test*)

 rawData["Results"] === {}, (*second test*)
 First[rawData["RelatedTopics"]]["FirstURL"],

 True, First[rawData["Results"]]["FirstURL"](*third test*)
 ]
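One situation these three tests do not cover is a response where `Results` is empty and `RelatedTopics` has no usable entry either (it is empty, or its first entry is a group without a `FirstURL`). A more defensive variant might look like the sketch below, where `pickLink` and its `fallback` argument are hypothetical names:

```wolfram
(* pickLink is a hypothetical helper; fallback is whatever should be
   returned when the response contains nothing usable *)
pickLink[data_Association, fallback_] := Module[{results, topics},
  results = Lookup[data, "Results", {}];
  (* keep only related topics that actually carry a URL *)
  topics = Select[Lookup[data, "RelatedTopics", {}], KeyExistsQ[#, "FirstURL"] &];
  Which[
   Lookup[data, "meta", Null] === Null, fallback,
   results =!= {}, First[results]["FirstURL"],
   topics =!= {}, First[topics]["FirstURL"],
   True, fallback
   ]
  ]

(* Hand-written response: no direct hit, first related entry is a group *)
pickLink[
 <|"meta" -> <||>, "Results" -> {},
  "RelatedTopics" -> {<|"Name" -> "Group", "Topics" -> {}|>,
    <|"FirstURL" -> "https://duckduckgo.com/Milk"|>}|>,
 "none"]
(* "https://duckduckgo.com/Milk" *)
```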

Add handling for multi-word queries and a redirect to the chosen link after execution, and we have enough to construct the function:

queryFunc[q_String] := Module[{search, api, import, link, request},
  search = StringReplace[q, WhitespaceCharacter -> "+"];
  api = "https://api.duckduckgo.com/?q=" <> search <> "&format=json";
  import = Import[api, "RawJSON"];
  link = Which[
    import["meta"] === Null, CloudObject["redirectPage"],(*this object will be constructed in the next section*)
    import["Results"] === {}, 
    First[import["RelatedTopics"]]["FirstURL"],
    True, First[import["Results"]]["FirstURL"]
    ];
  request = Delayed[HTTPRedirect[link]]
  ]
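If the request itself fails (no network, or a response that is not JSON), Import does not return an association, and the Which tests would then be applied to something they were not written for. A small sketch of a guard, using the hypothetical helper name `fetchAnswer`:

```wolfram
(* fetchAnswer is a hypothetical helper: it returns the parsed response,
   or Missing["NotAvailable"] if the import did not yield an association *)
fetchAnswer[url_String] := Module[{data},
  data = Quiet[Import[url, "RawJSON"]];
  If[AssociationQ[data], data, Missing["NotAvailable"]]
  ]

result = fetchAnswer["https://api.duckduckgo.com/?q=milk&format=json"];
If[MissingQ[result], "The search cannot be completed.", Keys[result]]
```

Inside queryFunc, the same check could send the user to the redirect page instead of returning a string.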

Creating the Form and Redirect Page

Now that the function is defined, we can focus on the form, which is built with FormFunction:

form = FormFunction[

  "query" -> <|
    "Hint" -> "What would you like to know?",
    "Interpreter" -> "String",
    "Label" -> None
    |>,

  queryFunc[#query] &,

  AppearanceRules -> <|
    "Title" -> "Custom Search Engine",
    "Description" -> 
     "Use the DuckDuckGo Instant Answer API to get answers for your \
searches without always needing to click on a result.",
    "SubmitLabel" -> "Search"
    |>
  ]

Additionally, for the first query case, where the `meta` field is Null, we need a page to redirect to that tells the user their search cannot be carried out:

redirectPage = 
 ExportForm[TextCell["The search could not be completed.", "Title"], 
  "CloudCDF"]

Deploying to the Cloud

First, deploy the redirect page for the first query case:

CloudDeploy[redirectPage, "redirectPage", Permissions -> "Public"];

Then deploy the search form:

CloudDeploy[form, "searchForm", Permissions -> "Private" (*change permissions as desired*)]

Next Steps

I put all the relevant functions in a small repo on GitHub, which you can find here: https://github.com/jldohmann/custom-search-engine

Eventually (when I find the time), I would like to write a wrapper for the API in WL. Several other wrappers already exist.

POSTED BY: Jesse Dohmann
2 Replies

Congratulations! This post is now featured in our Staff Pick column, as distinguished by a badge on your profile of a Featured Contributor! Thank you, keep it coming, and consider contributing your work to The Notebook Archive!

POSTED BY: EDITORIAL BOARD

Thank you for that information, Jesse.
