I am still lost. Here is my code:
CloudDeploy[
APIFunction[{"textSample" -> "String", "urlTarget" -> "URL"}, checkURL, "JSON" &];
checkURL = {If[urlTarget != "",
checkSample,
"Empty URL Parameter"]};
checkSample = {If[textSample != "",
func,
"Empty Text Sample Parameter"]};
func = {
excludeList = {"cart", "checkout", "thank-you", "login"} ,
UrlList = Commonest[
DeleteCases[
Import[urlTarget, "HyperLinks"],
{!= urlTarget, excludeList}], 20],
texts = Import /@ urlList,
c = Classify[texts -> urlList],
c[textSample, "Probabilities"]
};
]
I don't know if this is the right way to accomplish my goal.
I want this to be an API where I can pass info to it via parameter and receive a "JSON" or "TEXT" response that includes probability scores.
I want to exclude links that don't contain the urlTarget. This would exclude outbound links, etc. Also I want a to exclude urls that contain any of a list of exclude words. And, I just want to classify the most common appearing links.
I am a little concerned it will try to run a full classification on it every time we hit the API. Is there a way to save a trained model or cache it or something if the URL parameter is the same each time?