WOLFRAM MATERIALS for the ARTICLE:
William J Turkel, Digital Research Methods with Mathematica.
https://williamjturkel.net/digital-research-methods-with-mathematica/
Full book in notebook
GitHub Repository
This summer (2020) I revised the second edition of my open content, open access and open source textbook, Digital Research Methods with Mathematica. It is freely available as a Mathematica notebook, which can be opened either in Mathematica or in the free Wolfram Player. The content remains mostly the same as in the previous version, with a few minor edits. New in this edition are more than 100 short screencasts, which I created to support online learning during the COVID-19 pandemic.
The book focuses on learning to read code to the point where one can modify it to solve related research problems. Here are the topics that are covered.
- Introduction to Mathematica. Interacting with notebooks.
- Reading Code. Word frequency, word clouds and stopwords.
- Computable Knowledge. Entities, tables, timelines and maps.
- Text Content. Mathematica notebooks and expressions, strings and natural language processing.
- Data Structures. Lists, associations and datasets.
- Reusing Code. Defining and developing functions, keyword in context (KWIC).
- Networks. Metadata, matrices and social network analysis.
- Indexing and Searching. Pattern matching, topic classification and term distribution.
- Geospatial Analysis. Geographic information: raster, vector and attribute data.
- Images. Computer vision, face detection, feature extraction and image mining.
- Page Images. Optical character recognition (OCR), figure extraction and classification.
- Crawling. Browser automation, batch downloading, web archives and WARC files.
- Linked Open Data. Resource Description Framework (RDF), SPARQL queries and endpoints, JSON-LD.
- Markup Languages. Scraping and parsing, XML, Really Simple Syndication (RSS) and Text Encoding Initiative (TEI).
- Studying Societies. Computational social science, search data, social media and social networks.
- Extracting Keywords. Information retrieval, term frequency-inverse document frequency (TF-IDF) and rapid automatic keyword extraction (RAKE).
- Word and Document Vectors. Feature extraction, dimension reduction, word embeddings and global vectors.
- References, web services, bibliographic linked open data and citation networks.
- Natural Language. Multilingual analysis, computational linguistics and sentiment analysis.
- Web Services. Entity networks, publication search, dashboards, manipulating JSON.
- Databases. Parts, selections and transformations, computations and querying, relations.
- Measuring Images. Photogrammetry, georectification, handwriting and facial 3D reconstruction.
- Machine Learning. Unsupervised clustering, classify, predict and transfer learning.
SAMPLE PAGES: Lesson 21.3. Measuring Images: Handwriting