Split up tab-separated text file
I’m trying to import a .txt file with several columns, each of which can contain a sentence with any number of words. What I would like is a way to output each of those sentences to a different outlet. The columns are separated by tabs.
I am using the [text] object to open the text file and have tried to use the [zl] object to break it up, but I haven’t found a way to split it at the tabs. I only seem to be able to shave off a fixed number of words.
Is there something I’m overlooking? Is there a way to split a line of a .txt file into its tab-separated columns?
To give an example of two sentences -
Check out the [regexp] object.
Can you just re-create the original file with only one sentence per line? That is, do you need them to be in separate columns?
[regexp] is a great idea for many, many things, but would it recognize tabs as tabs, or just as whitespace? Maybe you could check whether there are multiple spaces in a row. I am far from an expert on that object, so I could be way off.
Another way is to use [atoi] and see when you get a tab character, which should be the number 9, I think.
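To illustrate the character-code idea outside of a patch, here is a minimal sketch in plain JavaScript. It is not a Max patch and not the [atoi] object itself; the function name `splitAtTabCodes` is made up for this example. It walks through a line one character at a time and starts a new column whenever the character code is 9, which is the ASCII code for a tab.

```javascript
// ASCII code of the tab character, the value to watch for.
var TAB = 9;

// Hypothetical helper: split one line into columns at every tab,
// by checking character codes one at a time.
function splitAtTabCodes(line) {
  var columns = [];
  var current = "";
  for (var i = 0; i < line.length; i++) {
    if (line.charCodeAt(i) === TAB) {
      // A tab ends the current column.
      columns.push(current);
      current = "";
    } else {
      current += line.charAt(i);
    }
  }
  // Whatever follows the last tab is the final column.
  columns.push(current);
  return columns;
}
```

Spaces inside a sentence are left alone, because only code 9 ends a column.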
Thanks for both suggestions. I tried to look into the [regexp] object yesterday, including the tutorial patch with usage examples, but I have no idea where to start with it, since I’ve never worked with regular expressions before. Is there anyone who could give me some pointers on that, maybe an example patch?
Seejay: I don’t think so. It’s an export from a larger Excel file that someone else makes (copy/paste), which consists of three columns. Manually changing the layout each time would slow down the process too much.
I will look into the atoi object tonight. Thank you for the suggestion!
How’s this for you?
Essentially, you have a character class of everything that’s not a tab, [^\t], and you group any number of these together, ([^\t]*). That splits the line at each tab (\t), and the text between the tabs is what goes into each substring. Then take the substrings output from the [regexp] object.
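For anyone new to regular expressions, the same character class can be tried in plain JavaScript. This is only a sketch of the pattern, not of the [regexp] object, whose matching behaviour may differ; the function name `splitByTabRegex` is invented for the example. One deliberate change: it uses `+` instead of `*`, because a global match with `*` in JavaScript also returns empty strings at each tab. As a side effect, empty columns are skipped.

```javascript
// Hypothetical helper: collect every run of non-tab characters in a line.
function splitByTabRegex(line) {
  // [^\t] matches any character that is not a tab;
  // + requires at least one of them, g finds every run in the line.
  var matches = line.match(/[^\t]+/g);
  // match() returns null when nothing matches (e.g. an empty line).
  return matches === null ? [] : matches;
}
```

Each element of the returned array corresponds to one column of the line.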
----------begin_max5_patcher---------- 594.3ocwV11aaBCDG+0To9cvxuZaJqBadHvd29bzzUYHdItCLHroKoU669rO GZdXMKj0UmHAVb2Y6K+3uuimu9p.bQyJtBi9B5VTPvyFKAfMqkfACA3Z1pxJ lBBDWyUJ1BNdxFmZ9JM3nno3Eix9ZgrhqgoPGrJlCA1T7vmIQ6FaSudHXxf4 u2H0JwSbvH8lvA6tX0qa4t7FiQ2M3qkoKWJjKtuiWpctiowl4hhxirCIg16j raB2NI6FIY0vxg+ZmfUgAO+55qrilgIuc3LCWuFY2EjPMC+h+JgjW1zKgfhF O8ndhdQoI.8RR1Ru3TuSuk7pplyPbQ7EdhRA7PR8g3Rx+o4e2eRmdYKq7GH0 55CudcjQdMjEdDjkb9HaxAWm.g.0no4.7lBjjF9twv95Bd234R9QvB8rwhvb H2PiBlbwIHxzXGJh2Jp1GHsrNCNz7t64RVQErCg9Tw8TEphKG+Axr+aTbDBp LmRh5pbk+9pm96krP6W1GcVsH8VGR.QQoScmCuPEw53K3qZQe31uMy7Se2m9 33OjRuP0tHotSp4D6PVxERpMyI1lo2QqoMRsc9NiS2xzShMhoiYxDTNTlm7u 0wzEK7sSG9sqvNZcb.HUM8ckCY3FIChtcSmyUZgjoEMxcBx1TDsSpsTLeNWt W014BksBL.xvi+pcz40ga4QRLyqKDw+IF4jIF8xjXm9UYj2SrQIwx7tBaTok +oEYL5dp2oU1XRq72TZsobFqs8QdmZyJ6xHSCfGZ5rOmNw8rP5d1sxltlOJF lRrwBrpla+1MV7mc -----------end_max5_patcher-----------
That works like a charm, thanks big_pause!
There is one small inconvenience with it, but I think that has more to do with the way the [text] object processes its input:
If you import a line with tabs in it from a text document, the tabs get converted to normal spaces. Your method works perfectly if you put "" at the beginning of the sentence in the text document, but that means you have to copy the text from Excel and then add the quotation marks manually. Is there a way to import text without having to use the quotation marks and still keep the tabs in there after import?
Hmm, never realised that, you learn something new every day.
Unless anyone has some bright ideas, I think it’s roll-your-own text file reader time in js or java. Seems a bit shit though.
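The parsing half of such a reader could look roughly like the sketch below. It assumes the file contents have already been read into a single string with the tabs intact; how that is done depends on the environment, so the file-reading step is left out. The function name `parseTabSeparated` is made up for illustration.

```javascript
// Hypothetical helper: turn the full text of a tab-separated file
// into an array of rows, each row being an array of columns.
function parseTabSeparated(text) {
  var rows = [];
  // Accept both Windows (\r\n) and Unix (\n) line endings.
  var lines = text.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    // Skip blank lines, such as the one after a trailing newline.
    if (lines[i] === "") continue;
    rows.push(lines[i].split("\t"));
  }
  return rows;
}
```

With three columns per line, `rows[n][0]`, `rows[n][1]` and `rows[n][2]` would be the three sentences of line n, ready to be sent to separate outlets.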
Yeah, and I still have to try this out, but I think I could probably get away with having a first column with a quotation mark in it for every line. Small inconvenience.
Also, if I use Open Office instead of Excel, I can save as a .csv file, where it will put quotation marks around every column. So I could probably look into how to break the line up by the content within quotation marks. :)
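Breaking a line up by the content within quotation marks can also be done with a regular expression. The sketch below is plain JavaScript and rests on two assumptions: every column is wrapped in double quotes, and a quotation mark inside a column is written as two quotation marks, which is the usual CSV convention. The function name `splitQuotedColumns` is invented for the example.

```javascript
// Hypothetical helper: return the text inside each pair of double quotes.
function splitQuotedColumns(line) {
  // Opening quote, then any mix of non-quote characters or doubled
  // quotes (""), then the closing quote. The group captures the content.
  var pattern = /"((?:[^"]|"")*)"/g;
  var columns = [];
  var match;
  while ((match = pattern.exec(line)) !== null) {
    // Turn doubled quotes back into single ones.
    columns.push(match[1].replace(/""/g, '"'));
  }
  return columns;
}
```

Because only the quoted content is captured, commas or tabs between the columns, and even inside them, do not affect the result.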