Group Abstract

Message Boards

2.1K Views

6 Replies

1 Total Like

View groups...

Follow this post

Share this post:

GROUPS:

Data Science Wolfram Language

Posted 1 year ago

I have the following xlsx dataset: I want to import only the `"Age"` and `"State"` columns into my notebook. Can I use `"SkipLines"` to achieve it? What I do now is Import["example-dataset.xlsx", {"Dataset",(Import as a dataset) 1,(Import only the 1st sheet) All,(Import all rows) {1, 2, 4}},(Import only the 1st, 2nd and 4th columns) "HeaderLines" -> {1, 1} (1st row and 1st column are headers) ] I tried `Import["example-dataset.xlsx",{"Dataset", 1}, "HeaderLines" -> {1, 1}, "SkipLines" -> {0, {2, 4}}]` but it doesn't work. I prepared this spreadsheet in Google sheets. For some reason, the sheet is 1000 rows long, so there are a lot of empty rows below the last meaningful row: How can I tell Wolfram not to import those ~1000 empty rows into the dataset? I tried `"EmptyField" -> None` and `"EmptyField" -> Null` but it doesn't help. Maybe I should use `SkipLines->17;;` to skip all lines but it doesn't work either. Attachments: example-dataset.xlsx

POSTED BY: Ehud Behar

6 Replies

Sort By:

Posted 1 year ago

Found an alternative that is more compact than my earlier suggestion: Import["example-dataset.xlsx",{"Dataset",1,1;;16,{2,4}},"HeaderLines"->1]

POSTED BY: Hans Milton

Posted 1 year ago

Thanks very much! But do you think that SkipLines can be used to simply ignore the rest of the sheet? As I said in the question, `SkipLines->17;;`?

POSTED BY: Ehud Behar

Posted 1 year ago

It seems that it is not possible to use `Span` in the "`SkipLines`" option

POSTED BY: Hans Milton

Posted 1 year ago

Before trying to specify ranges to the SkipLines option, I think I first need to understand its usage. Suppose I want to skip lines 5, 6 and 7. Then why neither `"SkipLines" -> {{5, 6, 7}}` nor `"SkipLines" -> {5, 6, 7}` work? By the way, see Rohit Namjoshi comment for a similar question I asked.

POSTED BY: Ehud Behar

Posted 1 year ago

The documentation for "`SkipLines`" refers only to ranges that start from the first line

POSTED BY: Hans Milton

Posted 1 year ago

Try this to remove the unwanted lines: ds=Import["example-dataset.xlsx",{"Dataset",1,All,{1,2,4}},"HeaderLines"->{1,1}][[1;;15]] Getting rid of the first column: Normal[ds] // Values // Dataset

POSTED BY: Hans Milton

Reply to this discussion

Reply Preview

Attachments

Remove Add a file to this post

Follow this discussion

or Discard

Feedback