[LiVE] Latent Semantic Analysis Workflows (WL Live-Stream Series)

Posted 4 years ago

In brief

The lectures on Latent Semantic Analysis (LSA) are to be recorded through Wolfram University (Wolfram U) in December 2019 and January-February 2020.

The lectures

  1. [X] Overview of LSA typical problems and basic workflows.
    Answering preliminary anticipated questions.

    • Here is the recording of the first session at Twitch.
    • What are the typical applications of LSA?
    • Why use LSA?
    • What is the fundamental philosophical or scientific assumption for LSA?
    • What is the most important and/or fundamental step of LSA?
    • What is the difference between LSA and Latent Semantic Indexing (LSI)?
    • What are the alternatives?
    • Using Neural Networks instead?
    • How is LSA used to derive similarities between two given texts?
    • How is LSA used to evaluate the proximity of phrases that use different words but have close semantic meaning?
    • How do the main dimension reduction methods compare?
  2. [X] LSA for document collections.
    Here is the recording of the second session at Twitch: https://www.twitch.tv/videos/523306241 .

    • Motivational example -- full-blown LSA workflow.

    • Fundamentals, text transformation (the hard way):

      • bag of words model,
      • stop words,
      • stemming.
    • The easy way with LSAMon.

    • "Eat your own dog food" example.

  3. [X] Representation of the documents - the fundamental matrix object.
    Here is the recording of the third session at Twitch: https://www.twitch.tv/videos/533991174 .

    • Review: last session's example.

    • Review: the motivational example -- full-blown LSA workflow.

    • Linear vector space representation:

      • LSA's most fundamental operation,
      • matrix with named rows and columns.
    • Pareto Principle adherence

      • for a document,
      • for a document collection, and
      • in general.
  4. Representation of unseen documents.
    Here is the recording of the fourth session at Twitch.

  5. LSA for image de-noising and classification.
    Here is the recording of the fifth session at Twitch.

    • Review: last session's image collection topics extraction.
    • Let us try that with two other datasets:
    • Image denoising (maybe):

      • Using handwritten digits (again).
    • Image classification:

      • Handwritten digits.
  6. [X] Further use cases.
    Here is the recording of the sixth session at Twitch.

    • Derive a custom taxonomy over a document collection.
      • Clustering with the reduced dimension.
    • Apply LSA to Great Conversation studies.
    • Use LSA for translation of natural languages.
    • Use LSA for making or improving search engines.
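
For readers who want to see the text-transformation steps from sessions 2-4 in concrete form (bag of words, stop words, stemming, a matrix with named rows and columns, and the representation of an unseen document), here is a minimal sketch in plain Python. It is not the LSAMon workflow used in the sessions, and it stops before the dimension reduction step. The stop-word list, the crude `stem` helper, and the three sample documents are all illustrative assumptions.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to"}


def stem(word):
    """Crude suffix stripping; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def to_terms(text):
    """Lowercase, split into words, drop stop words, stem the rest."""
    words = re.findall(r"[a-z]+", text.lower())
    return [stem(w) for w in words if w not in STOP_WORDS]


def document_term_matrix(docs):
    """Build a matrix with named rows (document ids) and named columns (terms)."""
    bags = {doc_id: Counter(to_terms(text)) for doc_id, text in docs.items()}
    terms = sorted(set().union(*bags.values()))
    rows = {doc_id: [bag[t] for t in terms] for doc_id, bag in bags.items()}
    return rows, terms


def represent(text, terms):
    """Map an unseen document onto the existing columns; unknown terms are dropped."""
    bag = Counter(to_terms(text))
    return [bag[t] for t in terms]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up sample collection, for illustration only.
docs = {
    "d1": "The cat is chasing the mice",
    "d2": "Cats and dogs are pets",
    "d3": "Stocks are falling on the markets",
}
rows, terms = document_term_matrix(docs)

# An unseen document is expressed over the same term columns, then compared.
query = represent("dogs and cats sleeping", terms)
ranked = sorted(rows, key=lambda d: cosine(query, rows[d]), reverse=True)
print(terms)
print(ranked)
```

In a full LSA workflow the rows of this matrix would typically be weighted and then projected onto a small number of topics before any similarity is computed; the sketch compares raw term counts only.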
POSTED BY: Anton Antonov
2 Replies

Added the notebook of the 6th live-coding session. (Last for the LSA series.)

POSTED BY: Anton Antonov

Added the notebook of the 5th live-coding session.

POSTED BY: Anton Antonov