Creating dataset from a file.txt

Posted 10 years ago

After the Mathematica 10.0 release, I came across the interesting new functions Association, Dataset, etc. I did not pay much attention to them until now, as I had always worked on theoretical things. Now I am working through a huge amount of astronomical data, and I need to correlate some parameters for each star of a given spectral type. I need to separate many catalogues and, within each catalogue, each star with its galactic coordinates (latitude and longitude), its parallax with error and quality flag, and its spectral type, like a descending tree:

CATALOGUES ->.....

-> catalogue A: -> star...
-> galactic ..., -> parallax ..., -> spectral type ..., -> identifiers ... (many designations, all for the same star but from different catalogues)

catalogue B -> star ...
-> galactic , etc..

....

I tried this:

First, I pick out the star names from the .txt file. Here is a link to a sample: the sample txt of a certain catalogue

    $1 = OpenRead[
       "C:\\Users\\decicco\\SkyDrive\\Documentos\\ProjetoFinal\\Simbad\\simbadART_Teste_dataMining.txt"];
    $2 = ReadList[$1, Record, RecordSeparators -> {{"Object "}, {" ---"}}]
    Close[$1];


    Out[3]= {"HR 5027 ", "HR 5036 "}


    $1 = OpenRead[
       "C:\\Users\\decicco\\SkyDrive\\Documentos\\ProjetoFinal\\Simbad\\Teste_dataMining.txt"];
    $2 = ReadList[$1, String];
    Close[$1];


Galactic coordinates



   Flatten@StringCases[$2, 
      "Coordinates(Gal,ep=J2000,eq=2000): " ~~ (x : 
          NumberString ...) ~~ (y : ___ ~~ NumberString ...) -> 
       ToExpression@{x, y}]

    Out[8]= {"307.0804", "  +06.8343         ", "307.7283", "  10.4014         "}
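
One likely reason the output above still contains padded strings: with `->`, the right-hand side `ToExpression@{x, y}` is evaluated once, before any match is made, so the matched substrings come back unconverted, and `___` also swallows the surrounding spaces. A minimal sketch of a tighter pattern, assuming each coordinate line looks like the hypothetical `sample` string below (modeled on Out[8], not copied from the file):

```mathematica
(* Hypothetical sample line, modeled on Out[8] above; the real file may differ *)
sample = {"Coordinates(Gal,ep=J2000,eq=2000): 307.0804  +06.8343         "};

(* :> delays ToExpression until lon and lat are bound to the matched text. *)
(* "+" ... skips an optional leading plus sign in front of the latitude. *)
galactic = Flatten[
  StringCases[sample,
   "Coordinates(Gal,ep=J2000,eq=2000):" ~~ Whitespace ~~ (lon : NumberString) ~~
     Whitespace ~~ "+" ... ~~ (lat : NumberString) :>
    ToExpression /@ {lon, lat}], 1]
```

This should give a list of numeric {longitude, latitude} pairs, one per matching line. Lines without the label contribute nothing.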

Parallaxes, error and quality

Flatten@StringCases[$2, 
  "Parallax: " ~~ (x : ___ ~~ NumberString ...) ~~ 
    "[" ~~ (y : ___ ~~ NumberString ...) ~~ "]" ~~ z : LetterCharacter -> 
   ToExpression@{x, y, z}] (* here I need to get the parallax, its error, and the parallax quality flag, which is usually A but can also be B or C *)

Out[41]= {} 
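
A possible reason for the empty result: the pattern requires the quality letter to follow `]` immediately, so it cannot match if a space separates them. Note also that applying `ToExpression` to `z` would turn the flag into a symbol rather than keep it as a string. A sketch that allows the whitespace, assuming a line of the form `Parallax: value [error] flag` (the `sample` line and its numbers are hypothetical, not taken from the file):

```mathematica
(* Hypothetical sample line; the values are made up for illustration *)
sample = {"Parallax: 0.50 [0.44] A"};

(* Convert the two numbers, keep the quality flag as a string *)
parallax = Flatten[
  StringCases[sample,
   "Parallax:" ~~ Whitespace ~~ (plx : NumberString) ~~ Whitespace ~~ "[" ~~
     (err : NumberString) ~~ "]" ~~ Whitespace ~~ (q : LetterCharacter) :>
    {ToExpression[plx], ToExpression[err], q}], 1]
```

Lines where the parallax or its error is not a plain number would simply not match, so they would need separate handling.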

Spectral type

In[15]:= Flatten@StringCases[$2, "Spectral type: " ~~ x : ___ ~~ Except["~"] -> x]

Out[15]= {"B0.5Ia C ~                 ", "B2.5Ib C ~                 "} (* I do not want the "~" ... *)
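
In the attempt above, `Except["~"]` constrains only the single character after `x`, and `___` is greedy, so nearly the whole rest of the line is captured. A sketch that keeps only the first run of non-whitespace characters after the label, assuming the spectral type itself contains no spaces (the `sample` lines are rebuilt from Out[15] by prepending the label):

```mathematica
(* Sample lines rebuilt from Out[15]; the label prefix is assumed *)
sample = {"Spectral type: B0.5Ia C ~                 ",
   "Spectral type: B2.5Ib C ~                 "};

(* Capture characters up to the first whitespace after the label *)
spectral = Flatten@StringCases[sample,
   "Spectral type:" ~~ Whitespace ~~ (sp : Except[WhitespaceCharacter] ..) :> sp]
```

If the quality letter after the type is also needed, a second named pattern after another `Whitespace` could capture it in the same way.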

Catalogue identifiers


In[44]:= Flatten@StringCases[$2, 
  RegularExpression["(?m)^Identifiers "] ~~ "(" ~~ DigitCharacter ~~ ") :" ~~ 
    x__ ~~ RegularExpression["(?m)^Notes "] -> x] (* I need to get only the designations, separated by commas *)

Out[44]= {}
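
One possible reason for the empty result: `$2` was read with `ReadList[$1, String]`, which gives one string per line, so a pattern that has to span from the `Identifiers` line to the `Notes` line can never match inside a single element. A lone `DigitCharacter` also matches only a one-digit count. A sketch that works on the whole file as a single string, assuming the block has the layout of the hypothetical `text` below (`CAT1 123` and `CAT2 456` are placeholder designations) and that designations are separated by at least two spaces or by a newline:

```mathematica
(* Hypothetical excerpt; for a real file one could use text = Import[path, "Text"] *)
text = "Identifiers (3):\n   HR 5027      CAT1 123\n   CAT2 456\n\nNotes (0) :\n";

(* Shortest stops at the first "Notes" after each "Identifiers" header *)
blocks = StringCases[text,
   "Identifiers (" ~~ DigitCharacter .. ~~ ")" ~~ WhitespaceCharacter ... ~~ ":" ~~
     Shortest[ids__] ~~ "Notes" :> ids];

(* Split each block on runs of two or more whitespace characters, or on a newline *)
identifiers = StringSplit[StringTrim[#], RegularExpression["\\s{2,}|\\n"]] & /@ blocks
```

Each element of `identifiers` is then the list of designations for one object, which can be joined with commas if a single string is wanted.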

After getting all of these parameters, I want to build a dataset for each catalogue I create, and these per-catalogue sub-datasets would sit inside one overall dataset.

Thanks!

POSTED BY: Marcelo De Cicco

I got confused: I now realize that a Dataset is not the same as an Association in the Wolfram Language, so I really do not know which form is better.
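
On Association versus Dataset: the two are not alternatives. A `Dataset` wraps nested lists and associations, so one option is to build the tree as nested associations and wrap the result. A minimal sketch using the values from the outputs earlier in this post. The key names (`"CatalogueA"`, `"GalacticLongitude"`, and so on) are made up for illustration, and pairing the values with the stars in the order they appear is an assumption:

```mathematica
(* Star -> parameters; values taken from the outputs shown in the post *)
stars = <|
   "HR 5027" -> <|"GalacticLongitude" -> 307.0804, "GalacticLatitude" -> 6.8343,
     "SpectralType" -> "B0.5Ia"|>,
   "HR 5036" -> <|"GalacticLongitude" -> 307.7283, "GalacticLatitude" -> 10.4014,
     "SpectralType" -> "B2.5Ib"|>|>;

(* Outer level: catalogue name; inner levels: star, then parameter *)
catalogues = Dataset[<|"CatalogueA" -> stars|>];

(* Descend the tree: catalogue, then star, then parameter *)
catalogues["CatalogueA", "HR 5027", "SpectralType"]

(* Stars in one catalogue with galactic latitude above 8 degrees *)
catalogues["CatalogueA", Select[#GalacticLatitude > 8 &]]
```

Further catalogues would be added as more keys in the outer association. `Normal` converts any query result back to plain associations.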

POSTED BY: Marcelo De Cicco