Summary: | Move caps hash stuff to Wocky | ||
---|---|---|---|
Product: | Telepathy | Reporter: | Jonny Lamb <jonny.lamb> |
Component: | gabble | Assignee: | Telepathy bugs list <telepathy-bugs> |
Status: | RESOLVED FIXED | QA Contact: | Telepathy bugs list <telepathy-bugs> |
Severity: | normal | ||
Priority: | medium | Keywords: | patch |
Version: | git master | ||
Hardware: | Other | ||
OS: | All | ||
URL: | http://cgit.freedesktop.org/~jonny/wocky/log?h=caps-hash | ||
Whiteboard: | review+ | ||
i915 platform: | i915 features: |
Description
Jonny Lamb
2011-01-24 05:02:21 UTC
I think WockyDiscoIdentity should probably be a boxed type. http://git.collabora.co.uk/?p=user/jonny/telepathy-gabble.git;a=shortlog;h=refs/heads/caps-hash This gabble branch is also interesting. It's actually not entirely clear to me that there's any value in WockyDiscoIdentity having accessors for all its fields, rather than the caller just using identity->type and friends. 197 if (!source) 198 return NULL; g_return_val_if_fail (source != NULL, NULL); 226 if (!arr) 227 return; if (arr == NULL) return; - * caps-hash.c - Computing verification string hash (XEP-0115 v1.5) - * Copyright (C) 2008 Collabora Ltd. + * wocky-caps-hash.c - Computing verification string hash (XEP-0115 v1.5) + * Copyright (C) 2008-2010 Collabora Ltd. It's 2011. On top of _compute_from_node(), shouldn't there be a function that computes a hash from a list of features, identities, and data forms? Otherwise Gabble/Salut will need to build up a WockyNode whenever they need to hash their caps… oh! they have to do this anyway, I suppose maybe? +#ifndef __WOCKY_PLUGINS_DISCO_IDENTITY_H__ o rly Technically we should not be using __-prefixed guards. It should really be WOCKY_PLUGINS_DISCO_IDENTITY_H. But this is pedantry. static gint identity_cmp (gconstpointer a, gconstpointer b) I think the guts of this function should be wocky_disco_identity_cmp (WockyDiscoIdentity *a, WockyDiscoIdentity *b). But obviously you'll still need the wrapper for GPtrArray silliness. * Computes verification string hash according to XEP-0115 v1.5 Hashes plural. + * + * Wocky does not do anything with dataforms (XEP-0128) included in + * capabilities. However, it needs to parse them in order to compute the hash + * according to XEP-0115. I don't think this is a particularly useful comment, honestly. The “does not do anything with” bit is liable to become wrong. I imagine we'll ultimately have something like: WockyCapabilities *wocky_capabilities_from_node (WockyNode *); /* accessors for identities, features, and forms */ gchar *wocky_capabilities_calculate_hash (WockyCapabilities *, GChecksumType); You updated the hashing test in Gabble. That test should be in Wocky if the hashing has moved there. I guess the Gabble branch isn't finished if it still has all the hash computation stuff. :) (In reply to comment #1) > I think WockyDiscoIdentity should probably be a boxed type. Done. (In reply to comment #5) > You updated the hashing test in Gabble. That test should be in Wocky if the > hashing has moved there. Done. (In reply to comment #0) > 12:57 < wjt> jonnylamb: the data form parser should use WockyDataForm > 12:57 < wjt> s/use/be replaced by using/ Done. (In reply to comment #3) > It's actually not entirely clear to me that there's any value in > WockyDiscoIdentity having accessors for all its fields, rather than the caller > just using identity->type and friends. Okay, sure, I just copied them from gabble. They're gone. > g_return_val_if_fail (source != NULL, NULL); Done. > if (arr == NULL) > return; Done. > It's 2011. You're right! > On top of _compute_from_node(), shouldn't there be a function that computes a > hash from a list of features, identities, and data forms? Otherwise > Gabble/Salut will need to build up a WockyNode whenever they need to hash their > caps… oh! they have to do this anyway, I suppose maybe? Done. (In reply to comment #4) > +#ifndef __WOCKY_PLUGINS_DISCO_IDENTITY_H__ Fixed. > I think the guts of this function should be wocky_disco_identity_cmp > (WockyDiscoIdentity *a, WockyDiscoIdentity *b). But obviously you'll still need > the wrapper for GPtrArray silliness. Done. > * Computes verification string hash according to XEP-0115 v1.5 > > Hashes plural. Fixed. > I don't think this is a particularly useful comment, honestly. The “does not do > anything with” bit is liable to become wrong. I imagine we'll ultimately have > something like: You're the boss, removed. Okay that's that. You're free to take another swing. Salut's branch is here: http://git.collabora.co.uk/?p=user/jonny/telepathy-salut.git;a=shortlog;h=refs/heads/caps “node: add attribute build tag for wocky_stanza_build”: I think you meant to say “xml:lang attribute build tag” but I'll let it slide. + GSList *l; + + for (l = node->children; l != NULL; l = l->next) + { + WockyNode *child = l->data; + + if (!wocky_strdiff (child->name, "value")) + { + if (type == WOCKY_DATA_FORM_FIELD_TYPE_TEXT_SINGLE) + { + type = WOCKY_DATA_FORM_FIELD_TYPE_TEXT_MULTI; + break; + } + else + { + type = WOCKY_DATA_FORM_FIELD_TYPE_TEXT_SINGLE; + } + } + } type = WOCKY_DATA_FORM_FIELD_TYPE_TEXT_SINGLE; WockyNodeIter iter; wocky_node_iter_init (&iter, node, "value"); if (wocky_node_iter_next (&iter, NULL) && wocky_node_iter_next (&iter, NULL)) type = WOCKY_DATA_FORM_FIELD_TYPE_TEXT_MULTI; ? The patch as it stands would leave type = 0 in some cases? + field = g_hash_table_lookup (dataform->fields, "FORM_TYPE"); + g_assert (field != NULL); I don't see any reason why this assertion is valid, given that this is data received over the network. - g_ptr_array_sort (form->fields, fields_cmp); - g_ptr_array_sort (field->values, char_cmp); I don't see anything sorting the fields or their values in the new code. The XEP says: 1. Sort the fields by the value of the "var" attribute. Ah you fixed this in And: 1. Append the value of the "var" attribute, followed by the '<' character. 2. Sort values by the XML character data of the <value/> element. 3. For each <value/> element, append the XML character data, followed by the '<' character. Please add a test case which hashes two forms which differ only in the order of their values. Also how does this deal with forms with fields of types other than text-single or text-multi? - if (!source) - return NULL; + g_return_val_if_fail (source != NULL, NULL); - if (!arr) + if (arr == NULL) Why does copy critical, but free merely early-return? + dataform = wocky_data_form_new_from_form (node, &error); + if (error != NULL) + { + DEBUG ("Failed to parse data form: %s\n", error->message); + g_clear_error (&error); } We should fail the entire hashing process if the form is malformed, or we'll get confused. *Particularly* if it's our own form that's malformed. + return strcmp (g_value_get_string (left_type->default_value), + g_value_get_string (left_type->default_value)); One of those should be 'right_type'. Also this will crash if someone specifies >1 value for FORM_TYPE. I suppose parsing the form should fail in that case. It'd be nice to have wocky_data_form_get_form_type (). Fixed in a later commit, I see. (In reply to comment #8) > ? The patch as it stands would leave type = 0 in some cases? I'd changed it to do the same thing as you did later but not as nicely, I'm now using the node iter. > I don't see any reason why this assertion is valid, given that this is data > received over the network. Fair, fixed. > 1. Append the value of the "var" attribute, followed by the '<' > character. > 2. Sort values by the XML character data of the <value/> element. > 3. For each <value/> element, append the XML character data, > followed by the '<' character. > > Please add a test case which hashes two forms which differ only in the order of > their values. There is a test for testing sorting dataforms and not only are the forms are in a different order but the values are jumbled up too. I don't see how a test that just jumbles up value ordering would give any more? > Also how does this deal with forms with fields of types other than text-single > or text-multi? Well, it won't include the field in the hash to SHA. I've included a debug message now. > Why does copy critical, but free merely early-return? Well it's like g_free()? I guess not as g_free() early returns too. Okay, I'll do the same here. > We should fail the entire hashing process if the form is malformed, or we'll > get confused. *Particularly* if it's our own form that's malformed. Fixed. Check out my branch, homey. (In reply to comment #9) > There is a test for testing sorting dataforms and not only are the forms are in > a different order but the values are jumbled up too. I don't see how a test > that just jumbles up value ordering would give any more? Okay, reasonable. > (In reply to comment #8) > > Also how does this deal with forms with fields of types other than text-single > > or text-multi? > > Well, it won't include the field in the hash to SHA. I've included a debug > message now. I don't think this is right, I'm afraid. The XEP says: • For each field other than FORM_TYPE: ‣ Append the value of the "var" attribute, followed by the '<' character. ‣ Sort values by the XML character data of the <value/> element. ‣ For each <value/> element, append the XML character data, followed by the '<' character. If WockyDataForm's API doesn't make this possible without contortions, we should make it possible, rather than risking incorrectly calculating the hash (despite having had all the necessary information at some point). field = g_hash_table_lookup (dataform->fields, "FORM_TYPE"); - g_assert (field != NULL); + + if (field == NULL) + continue; I think this is wrong, too. We should fail to calculate the hash in this case. Arguably parsing the data form should fail if there's no FORM_TYPE field, but XEP-0004 doesn't actually mandate that that field exist, afaict. I think there should be a test for the cases where a form has no FORM_TYPE, and where the disco reply contains two forms with the same FORM_TYPE; in both cases, calculating the hash should fail. We should not cache the supposed hash negatively, though: otherwise, malicious contacts could trick us into negatively caching a hash that some other client is using correctly and legitimately. I think it's reasonable for Gabble to still use the capabilities, though. From jdev@: > Zash: variations of empty fields isn't specified well iirc > Zash: eg <field><value/></field> vs <field/> vs no field at all Might be worth testing the first two variants do something sensible. + WockyStanza * stanza = wocky_stanza_build (WOCKY_STANZA_TYPE_IQ, What's going on ^^ here? (In reply to comment #10) > I don't think this is right, I'm afraid. The XEP says: > > • For each field other than FORM_TYPE: > ‣ Append the value of the "var" attribute, followed by the '<' > character. > ‣ Sort values by the XML character data of the <value/> element. > ‣ For each <value/> element, append the XML character data, > followed by the '<' character. > > If WockyDataForm's API doesn't make this possible without contortions, we > should make it possible, rather than risking incorrectly calculating the hash > (despite having had all the necessary information at some point). Okay, so I thought it would be difficult to start supporting all the other field types, but it turned out to be fine. Fixed and added a test for booleans. One problem I see might be an issue here is that the data form parser treats both "true" and "1" as value booleans for TRUE (and the opposite for FALSE), but the caps hashing code only gets a GValue back and uses 1 and 0 for TRUE and FALSE respectively. We don't get the actual stanza from the data form field so can't check the actual value. Also, it switches on the WockyDataFormFieldType value. I'm wondering if a good default value is to act as if it's text-single? What do you think? > I think this is wrong, too. We should fail to calculate the hash in this > case. Arguably parsing the data form should fail if there's no FORM_TYPE > field, but XEP-0004 doesn't actually mandate that that field exist, > afaict. > > I think there should be a test for the cases where a form has no > FORM_TYPE, Done. > ...and where the disco reply contains two forms with the same > FORM_TYPE; in both cases, calculating the hash should fail. Done. > We should not cache the supposed hash negatively, though: otherwise, > malicious contacts could trick us into negatively caching a hash that > some other client is using correctly and legitimately. I think it's > reasonable for Gabble to still use the capabilities, though. When gabble gets a reply from a disco request that doesn't match the hash we originally received (including us not being able to hash it ourselves) it sets the trust of the hash back to zero but doesn't appear to do much more than that? > From jdev@: > > Zash: variations of empty fields isn't specified well iirc > > Zash: eg <field><value/></field> vs <field/> vs no field at all > > Might be worth testing the first two variants do something sensible. Okay I added tests for <field var='lol'><value/></field>, <field var='lol'/> and <field/>. The first two fail because of the presence of the var attribute, but the last one fails because the data form parser just ignores fields with no var attribute. > + WockyStanza * stanza = wocky_stanza_build (WOCKY_STANZA_TYPE_IQ, > > What's going on ^^ here? Fixed. Check out my updated branch. I was sure I'd replied to this. But apparently my reply got lost. (In reply to comment #11) > (In reply to comment #10) > > I don't think this is right, I'm afraid. The XEP says: > > > > • For each field other than FORM_TYPE: > > ‣ Append the value of the "var" attribute, followed by the '<' > > character. > > ‣ Sort values by the XML character data of the <value/> element. > > ‣ For each <value/> element, append the XML character data, > > followed by the '<' character. > > > > If WockyDataForm's API doesn't make this possible without contortions, we > > should make it possible, rather than risking incorrectly calculating the hash > > (despite having had all the necessary information at some point). > > Okay, so I thought it would be difficult to start supporting all the other > field types, but it turned out to be fine. Fixed and added a test for booleans. > > One problem I see might be an issue here is that the data form parser treats > both "true" and "1" as value booleans for TRUE (and the opposite for FALSE), > but the caps hashing code only gets a GValue back and uses 1 and 0 for TRUE and > FALSE respectively. We don't get the actual stanza from the data form field so > can't check the actual value. Then WockyDataForm should be fixed to give you the actual values. http://cgit.freedesktop.org/~jonny/wocky/commit/?h=caps-hash&id=9d50b575a41fc5e3648023fbae372c06168ce983 I wouldn't do this like this. I'd make the loop that iterates the forms keep a set of FORM_TYPEs that it has already seen. Given that that loop already has code to bail out if the FORM_TYPE is missing, I don't see why this related validation happens ahead of time. (In reply to comment #12) > Then WockyDataForm should be fixed to give you the actual values. I've done something which fixes this, although perhaps not the best way ever? Thoughts? (In reply to comment #13) > I wouldn't do this like this. I'd make the loop that iterates the forms keep a > set of FORM_TYPEs that it has already seen. Given that that loop already has > code to bail out if the FORM_TYPE is missing, I don't see why this related > validation happens ahead of time. Okay, it now does the checking during. Branch updated. wjt: don't only save the original strings for booleans wjt: *always* save them all wjt: and then you can make all this code much easier wjt: the top patch [caps-hash: check whether there are multiple forms with the same name during, not before] is fine. (In reply to comment #15) > wjt: don't only save the original strings for booleans > wjt: *always* save them all > wjt: and then you can make all this code much easier le done. Also, I'm not so happy about "default_value_str". I'm open to suggestions. I meant, keep a list of the strings inside each <value> for *every* field type, not just booleans. The algorithm specified in XEP-0115 does not say anything at all about field types, so nor should our implementation of it. Argh. I somehow missed the last patch that actually did exactly this. Sorry for jumping the gun. (In reply to comment #17) > Also, I'm not so happy about "default_value_str". I'm open to suggestions. 'raw_value_contents', maybe? I think it would be better if the filling in of default_value_str when parsing the form happened before the type stuff: always call extract_value_list, stash it in default_value_str, and pass that into get_field_value() maybe? This would mean that we'd not risk accidentally ignoring some spurious extra <value> elements when calculating a hash. But functionally it would only change the behaviour for malformed forms; so maybe I should let it slide. The Salut branch looks good (obvs. predicated on the Wocky branch). (In reply to comment #19) > Argh. I somehow missed the last patch that actually did exactly this. Sorry for > jumping the gun. Words cannot express my anger. > 'raw_value_contents', maybe? OK, changed. > But functionally it would only change the behaviour for malformed forms; so > maybe I should let it slide. Good point. cork them up in a bottle and drop them in the ocean Cool, thanks. Merged wocky and salut branches. Did you look at the gabble branch? (In reply to comment #23) > Cool, thanks. Merged wocky and salut branches. Did you look at the gabble > branch? I can't help but think that, as long as all its test cases are in Wocky, Gabble's test-caps-hash.c could be expunged from the record. But that branch looks fine for Gabble master. (Not the stable branch.) (In reply to comment #24) > I can't help but think that, as long as all its test cases are in Wocky, > Gabble's test-caps-hash.c could be expunged from the record. Done. > But that branch looks fine for Gabble master. (Not the stable branch.) Coooool. |
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.