Thank you for the response, Dorian.
... not all byte sequences are a valid `String`, so, when one stores bytes in a string, the data needs to be validated
Do you mean the reverse, i.e. that not all `String`s are a valid byte sequence? This gives `True`:
tup = Tuples[Range[0, 255], {2}];
tup2 = ToCharacterCode /@ FromCharacterCode /@ tup;
tup2 === tup
All possible byte values, including `0`, seem to be storable in `String`s.
But either way, it is clear enough that a dedicated `ByteArray` is better for storing byte data than a string. That doesn't need to be explained further.
Other than the ones I mentioned, are there any operations we can perform on `ByteArray`s (especially things other than element extraction)?
Here are a few more suggestions, in addition to the ones I already mentioned:
Efficiently changing elements in-place through `Part`:
a = ByteArray[...];
a[[2]] = 5;
(This should also support `Span`, i.e. `;;`.)
`Append`, `Prepend`, `AppendTo`, `PrependTo`.
Something like `Partition` to break a big array into parts. My envisioned use case is processing a large `ByteArray` without unpacking the whole thing to an integer list. Instead, we could unpack small sections at a time, process them, then re-pack them. So perhaps other methods, such as `Map`, `BlockMap`, etc. are more appropriate. (E.g., `Audio` has `AudioBlockMap`.) If `ByteCount` can be trusted, there is a storage overhead of 96 bytes, so perhaps complete pre-`Partition`-ing is not the best.
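The block-wise idea could be sketched roughly as follows. `byteBlockMap` is a hypothetical helper name, not a built-in, and the sketch assumes that `Part` with a `Span` can extract a section of a `ByteArray` and that `Normal` unpacks that section to an integer list. It still assembles the full result as an integer list before re-packing, so it illustrates the interface rather than the memory savings a native implementation could offer.

```wolfram
(* byteBlockMap is a hypothetical helper, not a built-in function. *)
(* It applies f to consecutive blocks of at most n bytes, each unpacked to an
   integer list, and packs the concatenated results into a new ByteArray. *)
(* Assumes ba is non-empty and f returns a list of integers in 0..255. *)
byteBlockMap[f_, ba_ByteArray, n_Integer?Positive] :=
 ByteArray[
  Join @@ Table[
    f[Normal[ba[[i ;; Min[i + n - 1, Length[ba]]]]]],
    {i, 1, Length[ba], n}]]

(* Usage: invert every byte, unpacking only 4 bytes at a time *)
inverted = byteBlockMap[BitXor[#, 255] &, ByteArray[Range[0, 9]], 4];
Normal[inverted]
```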
Direct creation functions: Analogues of `ConstantArray` (large constant byte array) and `RandomInteger` (for random bytes).
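For comparison, the same results can be obtained today by going through an integer list first, which is exactly the intermediate step that direct creation functions would avoid. This is only a sketch of the workaround; the variable names are illustrative.

```wolfram
n = 10^6;

(* Constant byte array: n copies of the byte 255, built via an integer list *)
constBytes = ByteArray[ConstantArray[255, n]];

(* Random byte array: n bytes drawn uniformly from 0..255 *)
randBytes = ByteArray[RandomInteger[255, n]];

Length /@ {constBytes, randBytes}
```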
But the most important missing functionality is conversion:
- to/from strings (like FromCharacterCode, ToCharacterCode)
- to/from files (BinaryReadList, BinaryWrite)
- and very importantly: LibraryLink. Conversion to/from `RawArray` would suffice, as `RawArray`s work with LibraryLink since version 10.4. This would allow us to implement efficient functions for anything we need. I looked into the implementation of some of the built-in functions, and I see that currently sending to/from LibraryLink is done through an inefficient conversion to a 64-bit integer list (i.e. the `{Integer, 1}` LibraryLink type).
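
To make the string and file cases concrete, here is a minimal sketch of how these conversions can be done indirectly at the moment, by unpacking to an integer list with `Normal`. The file name is an arbitrary placeholder, and the detour through an integer list is the inefficiency that direct conversion functions would remove.

```wolfram
ba = ByteArray[{72, 101, 108, 108, 111}];

(* Bytes -> string and back, treating each byte as a character code 0..255 *)
str = FromCharacterCode[Normal[ba]];
ba2 = ByteArray[ToCharacterCode[str]];

(* Bytes -> file and back, using the "Byte" binary type *)
(* The file name below is a placeholder chosen for this example. *)
file = FileNameJoin[{$TemporaryDirectory, "bytearray-demo.bin"}];
BinaryWrite[file, Normal[ba], "Byte"];
Close[file];
ba3 = ByteArray[BinaryReadList[file, "Byte"]];

(* Both round trips should reproduce the original bytes *)
Normal[ba2] === Normal[ba3] === Normal[ba]
```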