@smn
Member

@smn smn commented Aug 8, 2014

MNOs can give us messages in encodings that are not among the standard encodings Python ships with.

We have two options:

  1. Register a new codec with the codec registry.
  2. Create a pluggable codec class for SMPP that falls back to Python's codec registry for the encodings it knows about, but provides hooks for supplying other encodings.

We've chosen to go with option 2 because adding a codec to the registry introduces all sorts of potential code loading race conditions.
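
A minimal sketch of what option 2 could look like, assuming a wrapper that consults its own hooks first and only then Python's codec registry. The class name `PluggableCodec` and its `register` method are hypothetical, not the names used in this PR, and the sketch is written for Python 3.

```python
import codecs


class PluggableCodec:
    """Hypothetical sketch: try custom hooks first, then Python's codec registry."""

    def __init__(self):
        # name -> (encode_fn, decode_fn); each callable takes (data, errors).
        self._custom = {}

    def register(self, name, encode_fn, decode_fn):
        self._custom[name] = (encode_fn, decode_fn)

    def encode(self, text, encoding, errors="strict"):
        if encoding in self._custom:
            return self._custom[encoding][0](text, errors)
        # Fall back to the stdlib; unknown names raise LookupError.
        return codecs.encode(text, encoding, errors)

    def decode(self, data, encoding, errors="strict"):
        if encoding in self._custom:
            return self._custom[encoding][1](data, errors)
        return codecs.decode(data, encoding, errors)


codec = PluggableCodec()
# "ucs2" proxies to utf-16be, as discussed later in this review.
codec.register(
    "ucs2",
    lambda text, errors: text.encode("utf-16be", errors),
    lambda data, errors: data.decode("utf-16be", errors),
)
```

Nothing is added to the global codec registry here, so the custom encodings only exist on the instance that registered them.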

@smn
Member Author

smn commented Aug 8, 2014

This would solve #338 for SMPP.

@smn
Member Author

smn commented Aug 12, 2014

@rudigiesler @justinvdm @hodgestar can I get a review?

@rudigiesler
Contributor

👍, but I don't know the Vumi code-base well enough to know if this might mess with something else.

Contributor

We implemented a ucs2 codec in this PR that proxies to the utf-16be codec. Should we maybe refer to it here?

Member Author

Yeah, my thinking was to have the map in the DeliveryShortMessageProcessor contain only codecs that already exist in the standard Python distribution. The configurable codec_class can override some of these if it wants, which it does with the ucs2 implementation it provides.

I'm ±0 on this really, it seemed a sensible thing to do yesterday but happy to stick in a custom codec here as well.

@hodgestar thoughts?

Contributor

Ah, ok, that works.

Contributor

I think sticking to the built-in codecs by default makes sense for now -- we can shift things around easily later.
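
To illustrate the split agreed on in this thread, here is a rough sketch of a default map that names only stdlib codecs, plus a configurable override. The names `DEFAULT_DATA_CODINGS` and `resolve_codec_name` are hypothetical, and the three entries are an illustrative subset of SMPP data_coding values rather than the map in this PR.

```python
import codecs

# Illustrative defaults only: SMPP data_coding value -> stdlib codec name.
DEFAULT_DATA_CODINGS = {
    1: "ascii",     # IA5 / ASCII
    3: "latin1",    # ISO-8859-1
    8: "utf-16be",  # UCS2, approximated by the stdlib codec
}


def resolve_codec_name(data_coding, overrides=None):
    """Return the codec name for a data_coding, preferring configured overrides."""
    merged = dict(DEFAULT_DATA_CODINGS)
    merged.update(overrides or {})
    return merged[data_coding]  # KeyError for an unmapped data_coding


# Every default must be resolvable without any custom codec installed.
for name in DEFAULT_DATA_CODINGS.values():
    codecs.lookup(name)
```

A deployment that wants the custom codec would pass something like `{8: "ucs2"}` as `overrides` and hand the resulting name to a codec class that knows about it.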

@justinvdm
Contributor

Minor comment, otherwise looks good.

@justinvdm
Contributor

👍

@smn
Member Author

smn commented Aug 12, 2014

@rudigiesler thanks for the review, I don't expect you to know everything or be catching problems. The best way to learn is by reading PRs (which is why we're constantly asking you to review stuff).

Contributor

Seems like performance on this would be terrible. Should we construct a reverse mapping?

Member Author

Good idea, let me start on that.

Contributor

\o/
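
The code under review is not shown in this conversation, so the following is only a sketch of the idea, assuming the slow path scanned a forward table for every character. It uses the first 16 entries of the GSM 03.38 default alphabet as a small excerpt; `encode_slow` and `encode_fast` are hypothetical names.

```python
# First 16 code points of the GSM 03.38 default alphabet (excerpt only).
GSM_EXCERPT = "@£$¥èéùìòÇ\nØø\rÅå"

# Forward map (code -> character) and a reverse map built once up front.
FORWARD = dict(enumerate(GSM_EXCERPT))
REVERSE = {char: code for code, char in FORWARD.items()}


def encode_slow(text):
    # str.index scans the table for every character: O(len(table)) per lookup.
    return bytes(GSM_EXCERPT.index(char) for char in text)


def encode_fast(text):
    # A dict lookup is O(1) on average per character.
    return bytes(REVERSE[char] for char in text)
```

Both functions raise on characters outside the excerpt (`ValueError` and `KeyError` respectively); a real codec would route those through its error handlers.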

@hodgestar
Contributor

I left a bunch of questions but they're mostly just to check my understanding of the changes.

@hodgestar
Contributor

I'm wondering whether we should land this and then sort out the 7-bit packing in a separate PR? We should start that PR by adding an integration test that sends in a 7-bit packed PDU and checks that we handle it correctly.

👍 on this PR landing once we have the ticket for the new one.
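
For context on what such a test would need to feed in, here is a small sketch of GSM 7-bit packing: septets are laid end to end, least significant bit first, and the bit stream is cut into octets. The function names are hypothetical and this is not the handling proposed for the follow-up PR, only an illustration of the packing itself.

```python
def pack_septets(septets):
    """Pack 7-bit values into octets, least significant bits first."""
    acc = 0
    for i, septet in enumerate(septets):
        acc |= (septet & 0x7F) << (7 * i)
    n_octets = (7 * len(septets) + 7) // 8  # round up to whole octets
    return acc.to_bytes(n_octets, "little")


def unpack_septets(octets, count):
    """Unpack `count` 7-bit values; the count is needed because trailing
    padding bits are indistinguishable from a final zero septet."""
    acc = int.from_bytes(octets, "little")
    return [(acc >> (7 * i)) & 0x7F for i in range(count)]


# Lowercase ASCII letters share their code points with the GSM default
# alphabet, so ASCII bytes can stand in for septets in this example.
packed = pack_septets(list(b"hellohello"))
```

Ten septets fit into nine octets here, which is the saving that makes packed payloads look nothing like the unpacked bytestrings.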

@smn
Member Author

smn commented Aug 14, 2014

@hodgestar GSM7Bit now returns bytestrings; I'm reasonably convinced that is what we were receiving anyway.

Also added somewhat proper handling of the errors kwarg.

Contributor

Does this perhaps need to be return self.gsm_basic_charset_map.get('?')?

Contributor

I guess we should have tests for the error cases?

Member Author

good catch!

@smn
Member Author

smn commented Aug 14, 2014

@hodgestar ready for rereview

@smn
Member Author

smn commented Aug 14, 2014

@rudigiesler @justinvdm also ready for re-review again

Contributor

I noticed another issue here -- we call the same error handlers for both encoding and decoding, but I don't think that makes sense:

  • For decoding, handle_replace_error needs to return u'?'.
  • For decoding and encoding, handle_strict_error should raise UnicodeDecodeError or UnicodeEncodeError as appropriate.
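
The two points above can be sketched with separate handlers per direction. This is an illustration under assumptions, not the code in this PR: `ToyGSMCodec` and its four-entry alphabet are hypothetical, and on Python 3 the decode replacement is the text `"?"` where the comment says `u'?'`.

```python
NAME = "toy-gsm"
# Tiny illustrative alphabet: code -> character.
DECODE_MAP = {0x00: "@", 0x02: "$", 0x3F: "?", 0x41: "A"}
ENCODE_MAP = {char: code for code, char in DECODE_MAP.items()}


class ToyGSMCodec:
    def encode(self, text, errors="strict"):
        out = bytearray()
        for pos, char in enumerate(text):
            if char in ENCODE_MAP:
                out.append(ENCODE_MAP[char])
            else:
                out.extend(self._encode_error(text, pos, errors))
        return bytes(out)

    def decode(self, data, errors="strict"):
        data = bytes(data)
        out = []
        for pos, code in enumerate(data):
            if code in DECODE_MAP:
                out.append(DECODE_MAP[code])
            else:
                out.append(self._decode_error(data, pos, errors))
        return "".join(out)

    def _encode_error(self, text, pos, errors):
        # Encoding errors must produce bytes, or raise UnicodeEncodeError.
        if errors == "replace":
            return bytes([ENCODE_MAP["?"]])
        if errors == "ignore":
            return b""
        if errors == "strict":
            raise UnicodeEncodeError(NAME, text, pos, pos + 1, "not in alphabet")
        raise LookupError("unknown error handler: %r" % (errors,))

    def _decode_error(self, data, pos, errors):
        # Decoding errors must produce text, or raise UnicodeDecodeError.
        if errors == "replace":
            return "?"
        if errors == "ignore":
            return ""
        if errors == "strict":
            raise UnicodeDecodeError(NAME, data, pos, pos + 1, "not in alphabet")
        raise LookupError("unknown error handler: %r" % (errors,))
```

Keeping the handlers apart duplicates a little logic, but each one can then return the right type and raise the right exception for its direction.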

Member Author

I had a feeling this was a bad idea to begin with.

@smn
Member Author

smn commented Aug 14, 2014

@hodgestar ok, again :) Not entirely happy with some of the duplication but it's not too bad.

Contributor

This should check that UnicodeEncodeError is raised (and the equivalent decode test should have a similar change).

Member Author

thanks, done.
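
The tests changed here are not shown in this conversation. As a rough sketch of the shape being asked for, the following uses the stdlib `ascii` codec as a stand-in for the codec under test; the test class name is hypothetical.

```python
import unittest


class StrictErrorTests(unittest.TestCase):
    def test_encode_strict_raises(self):
        # Assert the specific exception type, not just that something failed.
        with self.assertRaises(UnicodeEncodeError):
            "Zo\u00eb".encode("ascii", "strict")

    def test_decode_strict_raises(self):
        with self.assertRaises(UnicodeDecodeError):
            b"Zo\xeb".decode("ascii", "strict")
```

Saved to a file, this can be run with `python -m unittest`. Checking for the exact exception class would catch a handler that raises the decode error on the encode path, or the reverse.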

@hodgestar
Contributor

Other than one small comment, looking good.

@hodgestar
Contributor

👍 as soon as a Travis build passes (looks like they're building now).

@smn smn merged commit 7a86f39 into develop Aug 14, 2014