-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nested data in CSV #15
Comments
well, that's because CSV is a flat format, it doesn't support "nesting", you can work around it by using methods to extract/generate data, that way you have full control. Here's an example: class Address < Shale::Mapper
attribute :street, Shale::Type::String
attribute :city, Shale::Type::String
end
class Person < Shale::Mapper
attribute :name, Shale::Type::String
attribute :address, Address, default: -> { Address.new }
csv do
map 'name', to: :name
map 'street', using: { from: :street_from_csv, to: :street_to_csv }
map 'city', using: { from: :city_from_csv, to: :city_to_csv }
end
def street_from_csv(model, value)
model.address.street = value
end
def street_to_csv(model, doc)
doc['street'] = model.address.street
end
def city_from_csv(model, value)
model.address.city = value
end
def city_to_csv(model, doc)
doc['city'] = model.address.city
end
end
person = Person.from_csv(<<~DATA)
John Doe,Oxford Street,London
DATA
# => #<Person:0x2a06a
# @name="John Doe"
# @city="London",
# @street="Oxford Street">
person.to_csv
# => "John Doe,Oxford Street,London" |
Thanks for your quick reply! I just edited my initial post again to improve upon it and showing that there is a bit of a problem with nil values in CSV, too.
That is absolutely correct! However, shouldn't default behaviour create a somewhat acceptable and unsurprising result? Just having a JSON string which is created with different rules as the actual
This would actually solve the problem and would be 'good enough'. However, it would create a lot of boilerplate for a lot of people, I guess. |
That actually is documented (https://github.com/kgiszczak/shale#rendering-nil-values). CSV renders nil values by default (as opposed to other formats), because without it, it would mangle the CSV output. E.g. if your
If you really need, you can overwrite this behaviour with explicitly setting
That actually is not a JSON but stringified Ruby Hash, as all formats except XML uses this data structure internally. Underneath person.as_csv
# => { 'first_name' => 'John', `last_name` => 'Doe' }
I totally get it, ang it's not the first time this was raised. I tried to implement a solution for this problem, but it opend a can with worms with so much edge cases that I gave up. I might try to approch this problem again in the future, but there's no a quick solution for this. |
Thanks for your understanding. Can I find your approach back then somewhere? Maybe we can improve upon it, or maybe it is even enough for our use case.
Ah, I see. So this is some kind of fallback in case of “nested CSV”.
You are absolutely right. Not rendering the nil values in a CSV would break the CSV in most cases. I actually suggested this behaviour for the representation of nested data within a CSV. My comment about it showing nil was about that stringified Hash you use internally (which I misunderstood for some wired JSON).
You are right here, too. However, I feel it is a bit inconsistent to have CSV so different behaviour than anything else. Why not make |
I made pretty good progress on this and actually came up with a solution that I like and works really well.
Not really a fallback, more like a side effect of the underlying CSV parser.
Because CSV is much different than JSON, YAML, TOML or XML - in these formats you usually don't want to render empty fields/nodes, in CSV you have to, otherwise you mangle the output. |
Thanks, that would be very good! |
@kgiszczak Just a short reminder. :) |
sorry for the delay, I had a lot of work past few days, but I'm almost done, I need one more day to finish tests and write documentation, it'll be ready after the weekend. |
Hey @tbsvttr the feature is ready https://github.com/kgiszczak/shale#delegating-fields-to-child-attributes |
@kgiszczak Hey, that is very cool! As far as I can see, this solves the problem with relatively little boilerplate code. Can you release this as v0.10? |
I will make a release in a few weeks |
Hi! It's me again!
As already mentioned in the other issue, I am very interested in the CSV serialization capabilities of this (cool!) library.
However, it seems that the CSV serialization is not working in a very useful manner when data is nested.
As an example, given this setup:
These are the results:
Note how the JSON and XML can sensible handle the nested data structure, but the CSV now actually contains a JSON string in a field. My guess is that this will not be acceptable for most scenarios where CSV is actually used. Also note that in this JSON string, the nil values are rendered, while they are not rendered anywhere else. No
render_nil
was used anywhere. This lets me guess that this is somewhat of an unhandled or unintended behaviour.My suggestion would be to flatten the structure so that it will be compatible with CSV. Something like this:
Or with
render_nil: true
:Note that it render an empty string into each field, so that the table which is represented in CSV does maintain is coherence for each row.
With the following headers:
What do you think? Or did I overlook some feature of the library?
The text was updated successfully, but these errors were encountered: