S3

An experimental implementation of S3 in stor.

class stor.s3.S3DownloadLogger(total_download_objects)[source]
update_progress(result)[source]

Tracks number of bytes downloaded.

class stor.s3.S3Path(pth)[source]

Provides the ability to manipulate and access S3 resources with a similar interface to the path library.

Right now, the client defaults to Amazon S3 endpoints, but in the near future, users should be able to configure the S3 client with custom settings.

Note that any S3 object whose name ends with a / is considered to be an empty directory marker. These objects will not be downloaded and instead an empty directory will be created. This follows Amazon’s convention as described in the S3 User Guide.
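
The trailing-slash convention above can be illustrated with a minimal sketch. The helper names `is_dir_marker` and `split_listing` are hypothetical and are not part of stor; the sketch only shows how keys could be partitioned into regular objects and empty directory markers.

```python
def is_dir_marker(key: str) -> bool:
    """Return True when an object key denotes an empty directory marker."""
    # Assumption: a marker is any key that ends with a forward slash.
    return key.endswith("/")


def split_listing(keys):
    """Partition object keys into regular objects and directory markers."""
    objects = [k for k in keys if not is_dir_marker(k)]
    markers = [k for k in keys if is_dir_marker(k)]
    return objects, markers


# Hypothetical keys, for illustration only.
keys = ["data/a.txt", "data/empty/", "data/sub/b.txt"]
objects, markers = split_listing(keys)
```

Under this convention, only the keys in `objects` would be downloaded as files, while each key in `markers` would become an empty local directory.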

property bucket

Returns the bucket name from the path, or None if the path has no bucket.
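
As a rough illustration of what extracting a bucket name involves, here is a minimal sketch that works on plain strings. The function `bucket_of` is hypothetical and is not stor's implementation; it assumes paths of the form `s3://bucket/key`.

```python
from typing import Optional


def bucket_of(path: str) -> Optional[str]:
    """Return the bucket component of an s3:// path, or None if absent."""
    prefix = "s3://"
    if not path.startswith(prefix):
        return None
    # Everything up to the first slash after the scheme is the bucket.
    bucket = path[len(prefix):].split("/", 1)[0]
    return bucket or None
```

With this sketch, `bucket_of("s3://my-bucket/a/b.txt")` yields `"my-bucket"`, and a bare `"s3://"` yields `None`.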

property content_type

Get content type for path (using ContentType field from Boto) or empty string.

download(dest, condition=None, use_manifest=False, **kwargs)[source]

Downloads a directory from S3 to a destination directory.

Parameters
  • dest (str) – The destination path to download file to. If downloading to a directory, there must be a trailing slash. The directory will be created if it doesn’t exist.

  • condition (function(results) -> bool) – The method will only return when the results of the download match the condition.

Returns

A list of the downloaded objects.

Return type

List[S3Path]

Notes

  • The destination directory will be created automatically if it doesn’t exist.

  • This method downloads to paths relative to the current directory.
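
The `condition` parameter takes a callable that receives the results and returns a bool. A minimal sketch of such a callable follows. It assumes the results arrive as a list; the factory name `downloaded_at_least` is hypothetical.

```python
def downloaded_at_least(n):
    """Build a condition that passes once at least n results are present."""
    def condition(results):
        # Assumption: results is a list-like collection of downloaded paths.
        return len(results) >= n
    return condition


cond = downloaded_at_least(2)
```

A callable like `cond` could then be passed as `condition=cond` to `download`.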

download_object(dest, config=None, **kwargs)[source]

Downloads a file from S3 to a destination file.

Parameters

dest (str) – The destination path to download file to.

Notes

  • The destination directory will be created automatically if it doesn’t exist.

  • This method downloads to paths relative to the current directory.

exists()[source]

Checks existence of the path.

Returns

True if the path exists, False otherwise.

Return type

bool

Raises

RemoteError – A non-404 error occurred.

getsize()[source]

Returns the content length of an object in S3.

Directories and buckets have no length and will return 0.

isdir()[source]

Any S3 object whose name ends with a / is considered to be an empty directory marker. These objects will not be downloaded and instead an empty directory will be created. This follows Amazon’s convention as described in the S3 User Guide.

isfile()[source]

See: os.path.isfile()

list(starts_with=None, limit=None, condition=None, use_manifest=False, list_as_dir=False, ignore_dir_markers=False, **kwargs)[source]

List contents using the resource of the path as a prefix.

Parameters
  • starts_with (str) – Allows for an additional search path to be appended to the current S3 path. The current path will be treated as a directory.

  • limit (int) – Limit the number of results returned.

  • condition (function(results) -> bool) – The method will only return when the results match the condition.

  • use_manifest (bool) – Perform the list and use the data manifest file to validate the list.

Returns

Every path in the listing

Return type

List[S3Path]

Raises
listdir(**kwargs)[source]

List the path as a dir, returning top-level directories and files.

read_object()[source]

Read an individual object from OBS.

Returns

The raw bytes from the object on OBS.

Return type

bytes

remove()[source]

Removes a single object.

Raises
restore(tier='Bulk', days=10)[source]

Issue a restore command for a single object from Glacier.

Parameters
  • tier (str, default 'Bulk') – restore speed (see Glacier docs for details)

  • days (int, default 10) – number of days to keep data in S3 post-restore.

Note

Calling restore() on a directory will not work correctly; only use this for single objects. To restore everything under a directory, call it once per object, e.g. [s.restore() for s in stor.list(<directory>)].

Ignores RestoreAlreadyInProgressError and AlreadyRestoredError. Note that you can’t force S3 to do a faster restore once you’ve chosen a tier.

rmtree()[source]

Removes a resource and all of its contents. The path should point to a directory.

If the specified resource is an object, nothing will happen.

stat()[source]

Performs a stat on the path.

stat only works on paths that are objects. Using stat on a directory of objects will produce a NotFoundError.

An example return dictionary is the following:

{
    'DeleteMarker': True|False,
    'AcceptRanges': 'string',
    'Expiration': 'string',
    'Restore': 'string',
    'LastModified': datetime(2015, 1, 1),
    'ContentLength': 123,
    'ETag': 'string',
    'MissingMeta': 123,
    'VersionId': 'string',
    'CacheControl': 'string',
    'ContentDisposition': 'string',
    'ContentEncoding': 'string',
    'ContentLanguage': 'string',
    'ContentType': 'string',
    'Expires': datetime(2015, 1, 1),
    'WebsiteRedirectLocation': 'string',
    'ServerSideEncryption': 'AES256'|'aws:kms',
    'Metadata': {
        'string': 'string'
    },
    'SSECustomerAlgorithm': 'string',
    'SSECustomerKeyMD5': 'string',
    'SSEKMSKeyId': 'string',
    'StorageClass': 'STANDARD'|'REDUCED_REDUNDANCY'|'STANDARD_IA',
    'RequestCharged': 'requester',
    'ReplicationStatus': 'COMPLETE'|'PENDING'|'FAILED'|'REPLICA'
}
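
A short sketch of reading a few fields from a dictionary shaped like the one above. The values are made up for illustration, and treating missing keys as `0` or an empty string is an assumption that mirrors the documented behavior of `getsize()` and `content_type`, not a guarantee about what `stat()` returns.

```python
from datetime import datetime

# Hypothetical stat result with only a few of the fields shown above.
stat_result = {
    "ContentLength": 123,
    "ContentType": "text/plain",
    "LastModified": datetime(2015, 1, 1),
    "Metadata": {"owner": "example"},
}

# Fall back to neutral defaults when a field is absent.
size_bytes = stat_result.get("ContentLength", 0)
content_type = stat_result.get("ContentType", "")
encoding = stat_result.get("ContentEncoding", "")
modified_year = stat_result["LastModified"].year
```
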
to_url()[source]

Returns the HTTP URL for the object (virtual-hosted style).
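
In a virtual-hosted-style URL, the bucket name appears in the host name rather than in the path. A minimal sketch of building such a URL is shown below. The function `virtual_host_url` is hypothetical, and both the default Amazon endpoint and the percent-encoding of the key are assumptions; the actual output of `to_url()` may differ.

```python
from urllib.parse import quote


def virtual_host_url(bucket: str, key: str) -> str:
    """Build a virtual-hosted-style URL with the bucket in the host name."""
    # quote() leaves "/" intact by default and percent-encodes spaces etc.
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


url = virtual_host_url("my-bucket", "reports/2015 q1.csv")
```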

upload(source, condition=None, use_manifest=False, headers=None, **kwargs)[source]

Uploads a list of files and directories to S3.

Note that the S3Path is treated as a directory.

Note that for user-provided OBSUploadObjects, an empty directory’s destination must have a trailing slash.

Parameters
  • source (List[str|OBSUploadObject]) – A list of source files, directories, and OBSUploadObjects to upload to S3.

  • condition (function(results) -> bool) – The method will only return when the results of the upload match the condition.

  • use_manifest (bool) – Generate a data manifest and validate the upload results are in the manifest.

  • headers (dict) – A dictionary of object headers to apply to the object. Headers will not be applied to OBSUploadObjects and any headers specified by an OBSUploadObject will override these headers. Headers should be specified as key-value pairs, e.g. {'ContentLanguage': 'en'}.

Returns

A list of the uploaded files as S3Paths.

Return type

List[S3Path]

Notes:

  • This method uploads to paths relative to the current directory.

write_object(content: bytes) → None[source]

Writes an individual object.

Note that this method writes the provided content to a temporary file before uploading.

Parameters

content – raw bytes to write to OBS

class stor.s3.S3UploadLogger(total_upload_objects)[source]
update_progress(result)[source]

Keeps track of the total uploaded bytes by referencing the object sizes.