Commit 97399cf9 authored by Régis Behmo

Fix TypeError during transcript upload to S3

On a platform that is configured to upload video transcripts to S3
(`DEFAULT_FILE_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"`),
uploads from the studio fail with a TypeError: "Unicode-objects must be
encoded before hashing"

A full stacktrace of the issue can be found here:
https://sentry.overhang.io/share/issue/2249b6f67d794c7e986cc288758f4ebe/

This error is triggered by md5 hashing in the botocore library, which is
used by the S3Boto3Storage storage class. The error does not occur with
filesystem-based uploads because the filesystem storage backend does not
perform checksum verification. Why the error does not occur on edx.org
is unknown. Similar issues were already fixed in edxval.

To address this issue, we encode the transcript file content to bytes
before sending it to S3.
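
For illustration only, here is a minimal sketch of the failure mode and of the fix, assuming Python 3. The `sjson_text` payload and the `md5_hexdigest` helper are hypothetical stand-ins, not code from this commit or from botocore; the only real API exercised is `hashlib.md5`, which accepts bytes and rejects `str`.

```python
import hashlib
import json

# Hypothetical stand-in for the SJSON text produced by the transcript conversion.
sjson_text = json.dumps(
    {"start": [0], "end": [1000], "text": ["Bonjour, élève"]},
    ensure_ascii=False,
)


def md5_hexdigest(payload):
    """Hypothetical checksum step: hashlib.md5 only accepts bytes-like input."""
    return hashlib.md5(payload).hexdigest()


# Hashing the unencoded text raises TypeError on Python 3.
try:
    md5_hexdigest(sjson_text)
    str_hashed = True
except TypeError:
    str_hashed = False

# str.encode() defaults to UTF-8 on Python 3, so the bytes can be hashed.
encoded = sjson_text.encode()
digest = md5_hexdigest(encoded)
```

The exact wording of the TypeError message varies between Python versions, so the sketch checks only the exception type.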
parent ed1156a6
@@ -233,7 +233,7 @@ def transcript_upload_handler(request):
content=transcript_file.read().decode('utf-8'),
input_format=Transcript.SRT,
output_format=Transcript.SJSON
-)
+).encode()
create_or_update_video_transcript(
video_id=edx_video_id,
language_code=language_code,
@@ -109,7 +109,7 @@ def save_video_transcript(edx_video_id, input_format, transcript_content, langua
content=transcript_content,
input_format=input_format,
output_format=Transcript.SJSON
-)
+).encode()
create_or_update_video_transcript(
video_id=edx_video_id,
language_code=language_code,
@@ -222,7 +222,7 @@ def upload_transcripts(request):
content=transcript_file.read().decode('utf-8'),
input_format=Transcript.SRT,
output_format=Transcript.SJSON
-)
+).encode()
transcript_created = create_or_update_video_transcript(
video_id=edx_video_id,
language_code=u'en',
@@ -509,7 +509,7 @@ class VideoStudioViewHandlers(object):
content=transcript_file.read().decode('utf-8'),
input_format=Transcript.SRT,
output_format=Transcript.SJSON
-)
+).encode()
create_or_update_video_transcript(
video_id=edx_video_id,
language_code=language_code,