GROUPS: Wolfram Language, Computational Linguistics, Wolfram High School Summer Camp

[WSC20] Studying Rates of Language Change
Posted by Noemi Chulo, Student
Abstract
In this project I attempt to better understand how quickly languages change over time. Through a series of functions, I assign each language an estimated volatility score, then compare languages across various metrics to look for reasons behind their differing scores.
Introduction
I wanted to computationally classify languages by volatility and use this to better understand what makes languages more or less volatile. Being able to classify languages by their volatility allows us to better understand the underlying moving forces of natural language.
Initial Observations
At the beginning of this project, I manually examined a number of different words using the TimeSeries output of WordFrequencyData and saw noticeable changes across the 300 years of data. However, I noticed that older words have much lower word-frequency values, even when they were at the height of their use. (Note: I use MovingAverage[] on all graphs in this presentation to smooth the data. This means that interpretations of the data can be inaccurate at times.)
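As a quick illustration of the smoothing (my own toy example, not from the original notebook), an n-point moving average replaces each run of n values with its mean, shortening a length-m list to m - n + 1:

In[]:= MovingAverage[{1, 4, 2, 8}, 2]
Out[]= {5/2, 3, 5}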
Graphs (“cat” vs. “catt”)
We see that the “death” of the Middle English noun “catt” corresponds with the rise of the Modern English noun “cat”. We also see that “catt” was used about 25x less than “cat” even at its height of usage. After further research, I found a number of possible reasons for this, most notably unconventional spelling in English. Conventional spelling only began to be implemented in English in the 1500s, and even then it generally applied only to the small minority of the population who could both read and speak Latin. These spellings also often tweaked English words slightly to sound more like Latin; an example is the spelling of “falcon” instead of the then-used “faucon”. Today we use the word “falcon” and pronounce it with the “l”, but many of these changes were unprecedented and confusing at the time, which led to them being largely disregarded even into the late 1700s and early 1800s. For this reason, it is very likely that there were many different spellings and ways of writing “cat” in the 1700s, which would explain what appears to be a period where people just didn't write about cats very much. Another explanation could simply be that people didn't write about cats much in the 1700s. I noticed this pattern in most words I compared, so it is an important thing to keep in mind. Taking this and other logistical considerations into account, I decided not to include old versions of words in any language in my investigation.
Another thing to keep in mind is how WordFrequencyData stores its time data. For words with continuous data from 1701-2008, the time data is expressed as six values that instruct the function to create its own internal list of Unix times when the data needs to be graphed. Initially, I used UnixTime[] to reconstruct these values (as of this writing UnixTime[] is buggy), but I then found that I can fetch the time stamps directly using worddata["Times"]. Keep this in mind if you plan on using WordFrequencyData.
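For example, here is a minimal sketch of fetching the time stamps directly (this mirrors the getworddata helper defined below; the word “cat” is just a placeholder):

In[]:= ts = WordFrequencyData["cat", "TimeSeries", IgnoreCase -> False, Language -> "English"];
In[]:= ts["Times"]  (* the internal list of time stamps used by the TimeSeries *)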
Choosing Word Lists
Choosing the words whose volatility to score is an important job: I want a large volume of words to test, without biasing particular languages. For this reason, I attempted to avoid cultural or product-related words. I do recognize that having many different words decreases the chance of bias. I chose 151 words that I saw as somewhat constant to the human experience. Of course, my choices of what counts as constant to the human experience are subjective, so I do not claim these words are the ideal choice. I sorted the words into groups by the context in which I think each might be used. For example, although the word “war” is technically an abstract concept that doesn't represent a particular object, I did not put it in the “abstract conceptual” group. Instead I put it in the “human experience” group, because I thought it might be used more often alongside references to civilization than concepts like happiness and time. In a future iteration I would like to find a way to group words computationally, instead of by assumption. Below is my list of words and their groups.
DISCLAIMER:
You cannot directly compare particular sorted groups; instead, select a particular word group to use as `wordslist` and then run the implemented function.
Word Lists
Calculating Volatility
First things first: getting the word data. This function fetches the data from WordFrequencyData so that we can assign it to a variable.
getworddata[word_, language_] := WordFrequencyData[word, "TimeSeries", IgnoreCase -> False, Language -> language]
In[]:= worddata = getworddata["gauss", "English"];
Now I take the list of times and corresponding frequencies and calculate moving averages of both to smooth the graph.
In[]:= times = worddata["Times"];
In[]:= freqs = worddata["Values"];

getfulldata[worddata_, maval_] := MovingAverage[worddata, maval];
timesMAcalc[times_, maval_] := MovingAverage[times, maval];
freqMAcalc[freqs_, maval_] := MovingAverage[freqs, maval];

In[]:= timesMA = timesMAcalc[times, 29];
In[]:= freqsMA = freqMAcalc[freqs, 29];
Now we start on getting the practical derivative. finalcalc[] formats our data so that we can take derivatives with respect to time and frequency; its output has the form {{time_1, freq_1}, {time_2, freq_2}, ...}. Dividem[] is a simple slope-finding function: it takes two points and finds the slope between them, and each slope it outputs becomes a point in our final list of derivative values. DiviX[] and the functions following it select the correct value for each position in the Dividem[] call from final. Then fullfunction[] combines Dividem[] and the position functions so they can be used on our data values, and Runthrough[] runs fullfunction[] over the entire list, outputting our final list of derivatives.
finalcalc[timesMA_, freqsMA_] := Transpose[{timesMA, freqsMA}];
Dividem[x_, x1_, y_, y1_] := (y1 - y)/(x1 - x)
DiviX[num_, final_] := final[[num, 1]]
DiviX1[num_, final_] := final[[num + 1, 1]]
DiviY[num_, final_] := final[[num, 2]]
DiviY1[num_, final_] := final[[num + 1, 2]]
fullfunction[index_, final_] := Dividem[DiviX[index, final], DiviX1[index, final], DiviY[index, final], DiviY1[index, final]]

In[]:= final = finalcalc[timesMA, freqsMA];

Runthrough[timesMA_, final_] := Map[fullfunction[#, final] &, Drop[Range[Length[timesMA]], -1]]

In[]:= Runthrough[timesMA, final];
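As an aside, the same practical derivative can be computed with built-in list operations; this one-liner is my own equivalent sketch, not the notebook's code:

(* Differences gives the gaps between successive entries, so dividing the
   frequency gaps by the time gaps yields the same slope list as Runthrough *)
practicalDerivative[timesMA_, freqsMA_] := Differences[freqsMA]/Differences[timesMA]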
Since we now have our derivative values, we can start to analyze what they mean. This part of the code checks for zeroes in the derivative. It does this by pairing each derivative value in the list with its neighbor, i.e., {a, b, c, d, e, ...} becomes the neighbor list {{a, b}, {b, c}, {c, d}, {d, e}, ...}. Then CheckMinMax[] checks the sign of each member of each pair. This relies on the Intermediate Value Theorem, which states that if the signs of two points on a continuous function are different, there is a zero between them. All pairs where both elements are positive or both are negative are recorded as “nan”. All pairs where the first value is positive and the second is negative are recorded as “1”, which signifies a maximum in the original data at that point, and all pairs where the opposite is true are recorded as “-1”, which signifies a minimum. CheckALL[] then maps CheckMinMax[] over every pair and outputs a list we can use to see where the minima and maxima are.
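Here is a compact sketch of this neighbor-pairing idea (my own illustration; the notebook builds the pairs with getderivcompare below):

pairSign[{a_, b_}] := Which[
  Sign[a] == 1 && Sign[b] == -1, 1,   (* + to -: a maximum lies between *)
  Sign[a] == -1 && Sign[b] == 1, -1,  (* - to +: a minimum lies between *)
  True, "nan"]

In[]:= pairSign /@ Partition[{1., 2., -0.5, -1., 3.}, 2, 1]
Out[]= {nan, 1, nan, -1}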
CompDerivData[timesMA_, maDERval_, final_] := MovingAverage[Transpose[{Drop[timesMA, -1], Runthrough[timesMA, final]}], maDERval]
Divi1Y[Derivdata_, num_] := Derivdata[[num, 2]]
Divi1Y1[Derivdata_, num_] := Derivdata[[(num + 1), 2]]
In[]:= derivdata = CompDerivData[timesMA, 60, final];
NegPosLstF0[Derivdata_] := Map[Divi1Y[Derivdata, #] &, Range[Length[Derivdata] - 1]]
NegPosLstF1[Derivdata_] := Map[Divi1Y1[Derivdata, #] &, Range[Length[Derivdata] - 1]]
getderivcompare[Derivdata_] := Transpose[{Flatten[NegPosLstF0[Derivdata]], Flatten[NegPosLstF1[Derivdata]]}]
In[]:= derivcompare = getderivcompare[derivdata];
CheckMinMax[derivcompare_, x_] := Module[{ListCheck = {}},
 Which[
  Or[
   And[Sign[derivcompare[[x, 1]]] == 1, Sign[derivcompare[[x, 2]]] == 1],
   And[Sign[derivcompare[[x, 1]]] == -1, Sign[derivcompare[[x, 2]]] == -1],
   And[Sign[derivcompare[[x, 1]]] == 0, Sign[derivcompare[[x, 2]]] == 0]],
  Append[ListCheck, "nan"],
  And[Sign[derivcompare[[x, 1]]] == -1, Sign[derivcompare[[x, 2]]] == 1],
  Append[ListCheck, -1],
  And[Sign[derivcompare[[x, 1]]] == 1, Sign[derivcompare[[x, 2]]] == -1],
  Append[ListCheck, 1]]]
CheckALL[derivcompare_] := Flatten[Map[CheckMinMax[derivcompare, #] &, Range[Length[derivcompare]]]]

In[]:= checkall = CheckALL[derivcompare]
Out[]= {nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, 1, nan, nan, nan, nan, nan, nan, nan, nan}
Now that we have the locations of all the zeroes, we can use them to find out where our function is increasing and decreasing. Unfortunately, at the moment the function does not distinguish between decreases and increases, or between minima and maxima, when calculating volatility; it counts both simply as “trend changes that indicate volatility”. There is, however, a small way this distinction is surfaced (even though it can't be used in the calculation) that you will see later. Hopefully, at some point these can be weighted more heavily when calculating volatility.

getallzeroes[] grabs the locations of all maxima and minima and puts them in a list. Zero is always at the beginning, and the Length[] of derivdata is always the end value. This lets us choose the sublists whose means we take later, by indicating that we want the list separated into the groups {{0, zero_1}, {zero_1, zero_2}, ..., {zero_i, Length[derivdata]}}. We do this in getydata.

Now that we have our separated lists, we can split them further by checking for clusters in the data with FindClusters[]. Finally, we get the sizes of the clusters so we know the intervals of all the groups. This is necessary in case FindClusters[] split the data into more parts after the zeroes were found.
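To make the splitting step concrete, here is a toy illustration (mine, not the notebook's): Differences of the zero locations gives the sublist lengths that TakeList uses.

In[]:= TakeList[Range[10], Differences[{0, 4, 10}]]  (* split at a "zero" at position 4 *)
Out[]= {{1, 2, 3, 4}, {5, 6, 7, 8, 9, 10}}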
getallzeroes[derivdata_, checkall_] := Join[{0}, Sort[Join[Flatten[Position[checkall, 1]], Flatten[Position[checkall, -1]]]], {Length[derivdata[[All, 2]]]}];
In[]:= allzeroes = getallzeroes[derivdata, checkall]
Out[]= {0, 38, 47}
getydata[derivdata_, allzeroes_] := TakeList[derivdata[[All, 2]], Differences[allzeroes]];

In[]:= ydata = getydata[derivdata, allzeroes];
MeanShiftFindPath[ydata_, x_] := MeanShiftFilter[x, IntegerPart[Length@ydata/10], MedianDeviation@x, MaxIterations -> 10]
FindMeans[ydata_] := MeanShiftFindPath[ydata, #] & /@ ydata
FC[x_] := Flatten[FindClusters[x]];
FindCllusters[ydata_] := FC[#] & /@ FindMeans[ydata];
getsizeclusters[clusters_] := Accumulate[Length[clusters[[#]]] & /@ Range[Length[clusters]]];

In[]:= clusters = FindCllusters[ydata];
In[]:= sizeclusters = getsizeclusters[clusters]
Out[]= {38, 47}
Now we take the mean of all the values in each cluster. This will allow us to find where functions are increasing, decreasing, or dead.
A “dead” interval is any interval whose mean is less than or equal to wordDeathVal times the mean of all the other means. A small issue is that the mean of all other means does not consider the signs of those means, so if the other two means were 1.73179×10^-16 and -1.73179×10^-16, the calculated value would be 0. This is an error, because the intention is to detect on average how far from zero the other means are (an absolute value). It is not a huge error, since dead intervals are very uncommon compared to decreasing or increasing intervals, but it should be taken into account.
An increasing interval is any interval that is not dead and has a positive mean, while a decreasing interval is any interval that is not dead and has a negative mean. Intervals with means of exactly zero are not considered, because they would be dead by definition.
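A hedged sketch of the fix described above (my suggestion, not the notebook's code): wrap the other means in Abs before averaging, so that opposite-signed means no longer cancel.

(* deadQ tests the "dead" condition against the mean ABSOLUTE value of the other means *)
deadQ[meanclusters_, x_, wordDeathVal_] :=
 Abs[meanclusters[[x]]] <= wordDeathVal*Mean[Abs[Drop[meanclusters, {x}]]]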
meanclusterscalc[clusters_] := Map[Mean[#] &, clusters]

In[]:= meanclusters = meanclusterscalc[clusters]
Out[]= {1.53139×10^-16, -2.63058×10^-17}
DerivWordState[x_, meanclusters_, wordDeathVal_] :=
 If[Length[meanclusters] == 1,
  If[Sign[meanclusters[[x]]] == 1, "increasing", "decreasing"],
  Which[
   Abs[meanclusters[[x]]] <= Times[wordDeathVal, Mean[Drop[meanclusters, {x}]]], "dead",
   Sign[meanclusters[[x]]] == 1, "increasing",
   Sign[meanclusters[[x]]] == -1, "decreasing"]];
Now we generate a string, called statement, that tells us which intervals are dead, increasing, and decreasing. The locations of derivative values are not mapped to time yet, so it is only a rough indication to the user of where something is increasing or decreasing. statementClusters[] can print out a statement for the user; to view its output inside RunWord[], use the commented line as the final line of RunWord[]. We use this statement to determine the volatility score of the word. A future feature would be an output function that shows intervals with corresponding timestamp information.
statementClusters[sizeclusters_, meanclusters_, wordDeathVal_, allzeroes_, word_] :=
 StringJoin[
  Module[{statement = {StringJoin["The usage of ", word, " is "]}, n = 1},
   While[n < Length[sizeclusters] + 1,
    AppendTo[statement, DerivWordState[n, meanclusters, wordDeathVal]];
    AppendTo[statement, " on "];
    AppendTo[statement, ToString[allzeroes[[n]]]];
    AppendTo[statement, " through "];
    AppendTo[statement, ToString[allzeroes[[n + 1]]]];
    AppendTo[statement, ", "];
    n++];
   statement]]

In[]:= statement = statementClusters[sizeclusters, meanclusters, wordDeathVal, allzeroes, word]
Out[]= The usage of cat is increasing on 0 through 38, decreasing on 38 through 47,
Finally, the volatility is calculated by counting how many times the words “decreasing” and “increasing” appear in statement; “dead” intervals are not counted toward the volatility score. A future iteration might take into account how large an increase or decrease is within each interval and the absolute value of its mean. Another future implementation would expand further on the statement function within volatility. RunWord[] runs all the necessary calculations to fetch the volatility value of a particular word.
getVolatilityWord[statement_] := Length[Position[TextWords[statement], "increasing"]] + Length[Position[TextWords[statement], "decreasing"]]

In[]:= getVolatilityWord[statement]
Out[]= 1
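Incidentally, for statements like the one above the same count could be obtained with StringCount instead of TextWords and Position (a sketch of an alternative, not the author's code):

getVolatilityWordAlt[statement_String] :=
 StringCount[statement, "increasing"] + StringCount[statement, "decreasing"]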
RunWord[word_, maval_, maDERval_, wordDeathVal_, language_] :=
 Module[{worddata, times, freqs, FullData, timesMA, freqsMA, final, derivdata, derivcompare, checkall, allzeroes, ydata, clusters, sizeclusters, meanclusters, statement, initstatement},
  worddata = getworddata[word, language];
  times = worddata["Times"];
  freqs = worddata["Values"];
  timesMA = timesMAcalc[times, maval];
  freqsMA = freqMAcalc[freqs, maval];
  final = finalcalc[timesMA, freqsMA];
  derivdata = CompDerivData[timesMA, maDERval, final];
  derivcompare = getderivcompare[derivdata];
  checkall = CheckALL[derivcompare];
  allzeroes = getallzeroes[derivdata, checkall];
  ydata = getydata[derivdata, allzeroes];
  clusters = FindCllusters[ydata];
  sizeclusters = getsizeclusters[clusters];
  meanclusters = meanclusterscalc[clusters];
  statement = statementClusters[sizeclusters, meanclusters, wordDeathVal, allzeroes, word];
  getVolatilityWord[statement]
  (*Column[{statement, getVolatilityWord[statement]}]*)]

In[]:= RunWord["chat", maval, maDERval, wordDeathVal, "French"]
Out[]= 3
Below are my global variables. word is only applicable to RunWord[]; it is not used by CollectList[], TranslateWords[], or TotalVolatility[]. maval is the moving-average window applied to the raw time and frequency data. maDERval is the moving-average window applied to the derived time and frequency data. wordDeathVal decides at what point, relative to its sister clusters, the mean on an interval marks a “dead” interval. language decides which language we will analyze. wordslist is my list of 151 words that I decided might have somewhat consistent use across languages; feel free to use the sublists I indicated in the introduction, or create your own.

maval and maDERval both work well so long as they are near or above 30. If they are significantly below that, volatility scores begin to grow substantially and become somewhat arbitrary, really only indicating the amount of data (or lack thereof), not its actual volatility. wordDeathVal was an arbitrary value I chose based on what I thought signified word-movement death, and it can be optimized; please let me know if you know more about this or have suggestions. language, word, and wordslist can be changed to the user's liking.
(*setup of words and initial variables*)
word = "cat";
maval = 29;
maDERval = 60;
wordDeathVal = .01;
language = "English";
wordslist = {"cat", "dog", "hello", "goodbye", "tree", "water", "food", "love", "life", "death", "cloud", "river", "family", "friend", "mom", "dad", "brother", "sister", "red", "yellow", "green", "fish", "flower", "science", "bird", "book", "country", "city", "child", "rock", "clothes", "childhood", "innocence", "happiness", "old", "young", "eyes", "clock", "music", "art", "normal", "doctor", "soldier", "wife", "husband", "leaf", "word", "story", "beauty", "job", "work", "lazy", "pretty", "unique", "mountain", "forest", "elder", "baby", "kind", "smart", "purple", "day", "night", "sun", "stars", "ugly", "grass", "horse", "help", "time", "study", "war", "hate", "thunder", "fire", "blood", "woman", "man", "adult", "school", "anger", "sadness", "loneliness", "number", "feeling", "cold", "warmth", "hot", "spring", "summer", "fall", "winter", "money", "yes", "no", "power", "animal", "meat", "teacher", "snow", "heart", "mouth", "military", "son", "daughter", "rich", "poor", "mean", "evil", "light", "darkness", "safety", "danger", "human", "empathy", "boy", "girl", "quiet", "loud", "nose", "hair", "leader", "house", "hunger", "hope", "sleep", "bed", "sky", "air", "clean", "dirty", "color", "name", "dream", "peace", "farmer", "rain", "true", "student", "hear", "ear", "leg", "arm", "hand", "taste", "floor", "building", "business", "relax", "stress", "beach"};
Finally, here are the functions that summarize estimated volatility for an entire language.

CollectList[] creates a list of all the volatility scores output by RunWord[] for a given list of words in a particular language. It does not translate words on its own; run by itself, it needs to be given a list of words already in the language the user would like to use.

TranslateWords[] creates a list of translated words from an English word list into another language of the user's choosing. If the chosen language is English, it skips the translation.

DISCLAIMER:
This function uses service credits. If you plan to calculate volatility for any language that is NOT English, even if you translated the word list yourself, do not use TotalVolatility[]: it will automatically attempt to translate any list whose language was not indicated as English.

The reason I use TextTranslation[] (which costs service credits, as it relies on Microsoft data) instead of WordTranslation[] (which does not, as it relies on Wolfram data) is that WordTranslation[] was inaccurate when I tested it with a few of the words in my wordslist, so I decided against using it at this time. An example is shown below, where it suggests that the translations for “hair” might be “decapitate”, “guillotine”, or one of the actual translations of “hair”. In general, even when the function does choose correct translations, they are often less commonly used ones (for example, “poil”, as seen below, is a less common word for hair in French). This throws off the volatility relative to the corresponding English word, in turn making the total volatility score incomparable to the English one, because uncommon words tend to have different derivative graphs than common words. Meanwhile, TextTranslation[] appears to consistently choose an accurate word that is the most used in its given language. I hope that in the future WordTranslation[] is improved so my code can be run in other languages without using service credits.

TotalVolatility[] runs TranslateWords[], then feeds the results into CollectList[], which runs each translated word through RunWord[] in its corresponding language. TotalVolatility[] then takes the mean of all the volatilities collected by CollectList[]. This gives the user a total “estimated volatility score” for the given language.
In[]:= WordTranslation["hair", "French"]
Out[]= {"décapiter", "guillotiner", "poil"}

In[]:= TextTranslation["hair", "French"]
Out[]= "Cheveux"
CollectList[wordslist_, maval_, maDERval_, wordDeathVal_, language_] := Module[{list = {}}, Map[RunWord[wordslist[[#]], maval, maDERval, wordDeathVal, language] &, Range[Length[wordslist]]]]

TranslateWords[wordslist_, language_] := If[language == "English", wordslist, Map[TextTranslation[wordslist[[#]], language] &, Range[Length[wordslist]]]]
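Since TextTranslation[] spends service credits on every call, one defensive pattern (my suggestion, not part of the original notebook) is to memoize translations so each word is translated at most once per language:

(* memoization: the first call stores its own result as a definition, so
   repeated calls with the same word and language cost no further credits *)
translateCached[word_String, language_String] :=
 translateCached[word, language] = TextTranslation[word, language]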
TotalVolatility[wordslist_, maval_, maDERval_, wordDeathVal_, language_] := Module[{listtran = {}}, listtran = TranslateWords[wordslist, language]; N[Mean[CollectList[listtran, maval, maDERval, wordDeathVal, language]]]]
Although it takes a while to run, TotalVolatility[] gives us a really interesting number we can use to make inferences about a particular language! I also plan to have TotalVolatility[] show the user the standard deviation of the items in CollectList[].
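A hedged sketch of that planned extension (my own, under the assumption that reporting the mean and spread together is what is wanted):

TotalVolatilityStats[wordslist_, maval_, maDERval_, wordDeathVal_, language_] :=
 Module[{scores},
  scores = CollectList[TranslateWords[wordslist, language], maval, maDERval, wordDeathVal, language];
  (* report the mean alongside the spread of the per-word scores *)
  <|"Mean" -> N[Mean[scores]], "StandardDeviation" -> N[StandardDeviation[scores]]|>]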
Below is the volatility score for English, but I recommend testing this on your own to see what you get in different languages!
In[]:= TotalVolatility[wordslist, maval, maDERval, wordDeathVal, "English"]
Out[]= 2.84106
Conclusion/Future Work
The ability to check volatility across different languages opens a whole new view into computational linguistics, allowing us to think harder about how to quantify the driving forces of language change, and about how factors such as speaker population and cultural context might play a role alongside quantifiable natural variation. I hope that this project will excite other linguists and coders and help lead to future studies of the differing volatilities between languages. As data is collected, it will be added to this notebook.
In the future there are many features I'd like to add to my functions, many of which I mentioned as annotations in my code. For one, I'd like to make this project more accessible by adding dynamic elements and a user-friendly interface; that way it might reach people who aren't necessarily coders but are still interested in the subject, or inspire someone to take up a similar project. I'd also like to put the derivation of volatility on a firmer mathematical footing, with more sophisticated equations that give more insight into the true volatility score, and have variables (like wordDeathVal and my wordslist) decided by actual research and study instead of just intuition. Finally, one thing that continually inspired me was the simple idea in this thread, https://mathematica.stackexchange.com/questions/16723/time-series-decomposition-in-mathematica/16845#16845, of decomposing data into trend, seasonal, and random components. Currently my data only really considers trend, but I'd love to use the methods described there to find seasonal trends across words.
Acknowledgements
A big thank you to my mentor, Megan Davis, who was always there to help when I was stuck on something or went too far down a rabbit hole.
Thank you to Jessica Shi, who helped me organize my (verifiably messy) code.
I learned so much doing this project, and I’m so excited to help it improve in the future!
POSTED BY:
Noemi Chulo