SDK Quickstart

This Quickstart example will walk you through selecting and ordering ARD imagery using the ARD Python SDK.

Step 1: Install tools

To install the SDK, run:

pip install max-ard[full]

Windows users should see the main documentation page for information on installing some of the dependencies.

If you are using AWS S3, the SDK uses the Boto3 package to access S3 resources. For SDK functionality that accesses ARD products in S3, you will need to have correctly configured AWS credentials.

Step 2: Get an access token

The ARD Python SDK manages access tokens for you. You will need to provide your username and password via a configuration file or with environment variables.

Configuration File

You can store your login information locally in a configuration file. Create a text file named .ard-config in your home directory with the following contents:

[ard]
user_name = <your user name>
user_password = <your password>

Note: Your user name is the email address you use with your ARD account.
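
Before running the SDK, it can help to confirm that the file parses as a standard INI file with an [ard] section. The sketch below uses only Python's configparser and a throwaway temporary file. The helper name read_ard_credentials is hypothetical, and this is not necessarily how the SDK itself reads the file.

```python
import configparser
import tempfile
from pathlib import Path


def read_ard_credentials(path):
    """Return (user_name, user_password) from the [ard] section of an INI file.

    Hypothetical helper for checking the file; not part of the ARD SDK.
    """
    # interpolation=None keeps passwords containing '%' from being misread
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(path):
        raise FileNotFoundError(f"No readable config file at {path}")
    section = parser["ard"]
    return section["user_name"], section["user_password"]


# Demonstrate with a temporary file rather than the real ~/.ard-config
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / ".ard-config"
    demo_path.write_text(
        "[ard]\n"
        "user_name = me@email.com\n"
        "user_password = not-a-real-password\n"
    )
    user_name, user_password = read_ard_credentials(demo_path)

print(user_name)
```

To check your real file, you would pass Path.home() / ".ard-config" to the helper instead.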

Step 3: Configure your S3 bucket (AWS only)

If you use Azure or Google Cloud, authentication is handled differently and you can skip this step.

You can create your S3 bucket and set its access policy manually with the AWS Console or the AWS CLI tools (described in the API and CLI quickstart sections, respectively).

The ARD Python SDK has tools to create and configure an S3 bucket. You should change the name of the bucket below to something unique.

from max_ard.storage import init_bucket
init_bucket('my-ard-bucket-test')

Created bucket "my-ard-bucket-test"
Added ARD writer policy to bucket
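
S3 bucket names are shared across all AWS accounts, so a name like the one above may already be taken. One way to get a unique, valid name is to append a random suffix and check it against the core S3 naming rules (3 to 63 characters; lowercase letters, digits, dots and hyphens; starting and ending with a letter or digit). Both helpers below are hypothetical and are not part of the ARD SDK.

```python
import re
import uuid

# Core S3 naming rules: 3-63 characters drawn from lowercase letters, digits,
# dots and hyphens, starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name):
    """Check a candidate name against the core rules (not every S3 edge case)."""
    return bool(_BUCKET_RE.fullmatch(name)) and ".." not in name


def unique_bucket_name(prefix):
    """Append a short random hex suffix to make a collision unlikely."""
    return f"{prefix}-{uuid.uuid4().hex[:8]}"


name = unique_bucket_name("my-ard-bucket")
print(name, is_valid_bucket_name(name))
```

The generated name can then be passed to init_bucket in place of the literal string shown above.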

Step 4: Select ARD tiles to order

The ARD Select system provides advanced searching and selection for ARD data. This document covers the basics of specifying queries and running a Select with the Python SDK. If you already know the acquisition IDs you'd like to order, you can skip ahead to the example that orders directly by acquisition ID.

First we'll import the Select object, which you'll use to create and run a Select query.

from max_ard import Select

To limit the scope of a Select, it needs to be restricted to a geographic area. You can also include a list of specific acquisition IDs. For this example we'll use a bounding box over Albuquerque, New Mexico.

We'll also restrict the images to those collected between the dates of 2020-07-01 and 2021-01-25.

bbox = [-106.8, 35.1, -106.4, 35.4]
datetime =  "2020-07-01T00:00:00Z/2021-01-25T00:00:00Z"
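
It is easy to transpose coordinates or dates in these values, so a quick local check before submitting can save a round trip. The sketch below assumes the bounding box is ordered [west, south, east, north] in degrees, which is what the Albuquerque values above suggest, and that the interval is two UTC timestamps separated by a slash. The helper names are hypothetical, and boxes that cross the antimeridian are not handled.

```python
from datetime import datetime as DateTime, timezone


def check_bbox(bbox):
    """Validate a [west, south, east, north] box; return (width, height) in degrees."""
    west, south, east, north = bbox
    if not (-180 <= west < east <= 180):
        raise ValueError(f"Longitudes out of order or out of range: {west}, {east}")
    if not (-90 <= south < north <= 90):
        raise ValueError(f"Latitudes out of order or out of range: {south}, {north}")
    return east - west, north - south


def parse_interval(interval):
    """Split 'start/end' into two timezone-aware UTC datetimes."""
    start_text, end_text = interval.split("/")
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start = DateTime.strptime(start_text, fmt).replace(tzinfo=timezone.utc)
    end = DateTime.strptime(end_text, fmt).replace(tzinfo=timezone.utc)
    if start >= end:
        raise ValueError("Interval start must be before its end")
    return start, end


width, height = check_bbox([-106.8, 35.1, -106.4, 35.4])
start, end = parse_interval("2020-07-01T00:00:00Z/2021-01-25T00:00:00Z")
print(f"{width:.1f} x {height:.1f} degrees, {(end - start).days} days")
```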

In addition to the spatial and temporal restrictions, we would like to limit this Select to image tiles with the following qualities:

collected by WorldView-2
more than 95% cloud free within the Area of Interest (the bbox)
more than 75% of the AOI has valid image pixels

This query is represented in the Select system by the following dictionary:

query = {
        "platform": {
          "eq": "worldview-02"
        },
        "aoi:cloud_free_percentage": {
          "gte": 95
        },
        "aoi:data_percentage": {
          "gte": 75
        }
    }
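
To make the meaning of the dictionary concrete, here is a small local evaluator that checks whether a set of tile properties satisfies such a query. It is only an illustration of the filter semantics described above, covering the two operators used in this example (eq and gte). The sample property values are made up, and the real evaluation happens inside the Select service.

```python
import operator

# Only the two operators that appear in the example query
OPERATORS = {"eq": operator.eq, "gte": operator.ge}


def matches(query, properties):
    """Return True if every condition in the query holds for the properties."""
    for field, conditions in query.items():
        if field not in properties:
            return False
        for op_name, target in conditions.items():
            if not OPERATORS[op_name](properties[field], target):
                return False
    return True


example_query = {
    "platform": {"eq": "worldview-02"},
    "aoi:cloud_free_percentage": {"gte": 95},
    "aoi:data_percentage": {"gte": 75},
}

# Made-up property values for two candidate tiles
clear_tile = {
    "platform": "worldview-02",
    "aoi:cloud_free_percentage": 98,
    "aoi:data_percentage": 80,
}
cloudy_tile = dict(clear_tile, **{"aoi:cloud_free_percentage": 72})

print(matches(example_query, clear_tile), matches(example_query, cloudy_tile))
```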

We'll also add a stack depth of 3. This is how many tiles in each grid cell the Select system will try to return.

The Select system picks tiles by scoring them and returning the best-scoring ones. The score is based on recency, with newer images favored over older ones, and on how each tile compares against any filters that use numerical comparisons (>, >=, <, <=).

For this example, when we set the cloud-free percentage to 95 (that is, aoi:cloud_free_percentage >= 95), the Select system scores acquisitions better the higher their cloud-free percentage is. This means that an image that is 100% cloud free will score better on this metric than an image that is 96% cloud free. An image that is only 72% cloud free does not satisfy the filter and is rejected.

select = Select(datetime=datetime, bbox=bbox, query=query, stack_depth=3)
select.submit()
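
The interaction between filtering, scoring and stack depth can be pictured with a toy ranking for a single grid cell. The ordering used here (cloud-free percentage first, then recency) is an assumption chosen for illustration. The actual scoring formula of the Select service is not described in this document, and the candidate records are made up.

```python
def pick_stack(candidates, min_cloud_free=95, stack_depth=3):
    """Drop candidates that fail the filter, then keep the best few.

    Illustrative ranking only: higher cloud-free percentage wins, and ties
    are broken in favor of the more recent acquisition date.
    """
    passing = [c for c in candidates if c["cloud_free"] >= min_cloud_free]
    ranked = sorted(
        passing,
        key=lambda c: (c["cloud_free"], c["date"]),  # ISO dates sort correctly as text
        reverse=True,
    )
    return ranked[:stack_depth]


# Made-up candidates covering one grid cell
candidates = [
    {"id": "A", "cloud_free": 96, "date": "2020-12-01"},
    {"id": "B", "cloud_free": 100, "date": "2020-08-15"},
    {"id": "C", "cloud_free": 72, "date": "2021-01-10"},
    {"id": "D", "cloud_free": 100, "date": "2020-11-20"},
    {"id": "E", "cloud_free": 97, "date": "2020-09-05"},
]

stack_ids = [c["id"] for c in pick_stack(candidates)]
print(stack_ids)
```

Candidate C is rejected by the filter even though it is the most recent, and A passes the filter but is dropped because only three tiles fit in the stack.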

Depending on how complicated the query is, the Select service API may be able to respond with an answer, or it may respond with a job number for the user to check. The ARD SDK handles this for you. The wait_for_success() method will poll the API for you until the selection has finished.

select.wait_for_success()
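
Conceptually, a wait of this kind is a polling loop. The following sketch shows the general pattern with a stand-in status function. It is not the SDK's implementation: the function name wait_until_done and the "RUNNING" and "FAILED" state names are assumptions made for the illustration.

```python
import time


def wait_until_done(get_state, interval=5.0, max_checks=100):
    """Call get_state() repeatedly until it reports success or failure.

    Generic polling pattern; state names other than "SUCCEEDED" are assumed.
    """
    for _ in range(max_checks):
        state = get_state()
        if state == "SUCCEEDED":
            return state
        if state == "FAILED":
            raise RuntimeError("The job reported a failure")
        time.sleep(interval)
    raise TimeoutError(f"Job did not finish after {max_checks} checks")


# Stand-in for an API call: reports two pending checks, then success
fake_states = iter(["RUNNING", "RUNNING", "SUCCEEDED"])
final_state = wait_until_done(lambda: next(fake_states), interval=0)
print(final_state)
```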

select

<ARD Select 5634875774438159872>

Let's look at some of the results of the Select:

results = select.results
print(f'Found the following acquisition IDs: {results.acquisition_ids}')
print(f'Covering {len(results.stacks)} cells')
print('Ordering this selection will use:')
print(f'{select.usage.area.total_imagery_sqkm} sq.km')

Found the following acquisition IDs: ['10300100B3841C00', '10300100AC94D700', '10300100AA1C6800', '10300100B2B49700', '10300100B39ACD00', '10300100AB101A00', '10300100AD437400', '10300100AC700900', '10300100ACCDAB00', '10300100A9547600', '10300100A9CC9200', '10300100A67D3100']
Covering 45 cells
Ordering this selection will use:
- 704 sqkm of fresh imagery
- 1562 sqkm of standard imagery
- 0 sqkm of training imagery

To order this Select, we'll create an order with the Select ID and a destination. We'll also add an email notification.

Because AWS bucket permissions are set ahead of time, AWS users can use the SDK's shortcut argument of destination to specify the S3 bucket and/or prefix as shown in the first example. For Azure and Google Cloud you'll need to provide an output_config instead.

So that you don't accidentally order imagery by running the notebook, we've also added the dry_run keyword, which tells the system to validate the order as if it were real but not start the tile generation pipeline. Note: no emails are sent for a dry run.

AWS destination:

from max_ard import Order

order = Order(select_id=select.select_id, destination='my-ard-bucket-test/ABQ', dry_run=True)
order.add_email_notification('me@email.com')

Azure and Google Cloud destinations:

Credentials for Azure and Google Cloud are passed as part of the order configuration. See the Azure and Google sections of the Cloud Storage Setup documentation for detailed setup information. In these examples we directly pass the credentials but the ARD API also lets you securely store your credentials for convenient reuse.

You will need to create an appropriate configuration object:

config = {
  "output_config": {
    "azure_blob_storage": {
      "sas_url": "Azure SAS URL string goes here",
      "container": "my-ard-container",
      "prefix": "prefix-1"
    }
  }
}

config = {
  "output_config": {
     "google_cloud_storage": {
       "service_credentials": "... base64-encoded credentials string ...",
       "bucket": "my-ard-bucket",
       "prefix": "prefix-1"
     }
   }
}
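
For the Google Cloud case, the credentials string has to be base64-encoded. A minimal sketch of producing such a string with the standard library is shown below. It assumes the value is the base64 encoding of the service account's JSON key, which you should confirm against the Cloud Storage Setup documentation. The helper name and the credential contents are placeholders.

```python
import base64
import json


def encode_service_credentials(credentials):
    """Serialize a credentials dict to JSON and base64-encode it as ASCII text.

    Hypothetical helper; the exact payload the service expects is an assumption.
    """
    raw = json.dumps(credentials).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


# Placeholder content; a real key file has many more fields
fake_credentials = {"type": "service_account", "project_id": "example-project"}

gcs_config = {
    "output_config": {
        "google_cloud_storage": {
            "service_credentials": encode_service_credentials(fake_credentials),
            "bucket": "my-ard-bucket",
            "prefix": "prefix-1",
        }
    }
}

encoded = gcs_config["output_config"]["google_cloud_storage"]["service_credentials"]
print(encoded[:16] + "...")
```

With a real key file, you would load its contents with json.load and pass the resulting dictionary to the helper.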

Then pass the configuration when initializing the Order object:

from max_ard import Order

order = Order(select_id=select.select_id, output_config=config, dry_run=True)
order.add_email_notification('me@email.com')

If you already know the acquisition IDs you would like to order, for example because you found the images by searching discover.maxar.com, you can order them directly without a Select (AWS destination shown):

acquisition_ids = [
    '<id 1>',
    '<id 2>',
]
order = Order(acquisitions=acquisition_ids, destination='my-ard-bucket-test/ABQ', dry_run=True)
order.add_email_notification('me@email.com')

Then submit your order:

# if you run this without dry_run=True, you will order the tiles!
order.submit()

# if this was not a dry run, you would get an order ID back
# print(f'Order ID: {order.order_id}')

It can take anywhere from 1 to 3 business days to deliver your order. You can check the order status from the order object itself, or using the order ID:

print(order.status)

# Order.from_id(<order id>).state also works

SUCCEEDED

Now let's fast-forward to looking at some delivered ARD data. We'll look at data from a sample order in the S3 location s3://maxar-ard-samples/v4/sample-001/. Since this sample data is in a public bucket, we'll also pass public=True to tell the SDK to skip authentication. If this data were stored in Azure or Google Cloud, the path would use the az:// or gc:// scheme instead.

from max_ard import ARDCollection

collection = ARDCollection('s3://maxar-ard-samples/v4/sample-001/', public=True)

We can see what acquisitions are stored there:

collection.acquisitions

[<Acquisition of 104001002124FA00 [<ARDTile at Z10-120020223023>]>,
 <Acquisition of 1040010022712A00 [<ARDTile at Z10-120020223023>, <ARDTile at Z10-120020223032>]>,
 <Acquisition of 103001005C2E5E00 [<ARDTile at Z10-120020223023>, <ARDTile at Z10-120020223032>]>,
 <Acquisition of 103001005D31F500 [<ARDTile at Z10-120020223032>]>]

We can also get a summary of dates:

print(collection.dates)
print(collection.start_date)
print(collection.end_date)

['2016-09-22', '2016-09-23', '2016-09-30', '2016-10-08']
2016-09-22
2016-10-08

An ARDCollection gives access to all the tiles too:

collection.tiles

[<ARDTile of 104001002124FA00 at z10-120020223023>,
 <ARDTile of 1040010022712A00 at z10-120020223023>,
 <ARDTile of 103001005C2E5E00 at z10-120020223023>,
 <ARDTile of 1040010022712A00 at z10-120020223032>,
 <ARDTile of 103001005C2E5E00 at z10-120020223032>,
 <ARDTile of 103001005D31F500 at z10-120020223032>]

Taking a look at the first tile, we can see some of its properties:

tile = collection.tiles[0]
print('Tile:', tile)
print('Cell:', tile.cell)
for k,v in tile.properties.items():
    print(f'  {k}: {v}')

Tile: <ARDTile of 104001002124FA00 at z10-120020223023>
Cell: <Cell Z10-120020223023>
  datetime: 2016-09-22 19:22:05Z
  platform: WV03
  gsd: 0.34
  ard_metadata_version: 0.0.1
  catalog_id: 104001002124FA00
  utm_zone: 10
  quadkey: 120020223023
  view:off_nadir: 20.1
  view:azimuth: 191.0
  view:incidence_angle: 67.9
  view:sun_azimuth: 163.8
  view:sun_elevation: 51.2
  proj:epsg: 32610
  proj:geometry: {'type': 'Polygon', 'coordinates': [[[544843.75, 4185156.25], [544843.75, 4179843.75], [550156.25, 4179843.75], [550156.25, 4185156.25], [544843.75, 4185156.25]]]}
  proj:bbox: [544843.75, 4179843.75, 550156.25, 4185156.25]
  tile:data_area: 28.2
  tile:clouds_area: 0.0
  tile:clouds_percent: 0

Finally, let's plot a thumbnail of the 'visual' asset. Because ARD tiles are cloud-optimized GeoTIFFs, we can read from the overview to get a low resolution representation for our thumbnail:

%matplotlib inline
from rasterio.plot import show
with tile.open_asset('visual') as src:
    # setting the output shape to a smaller size makes rasterio read overview instead of full resolution data
    arr = src.read(out_shape=(3, int(src.height / 64), int(src.width / 64)))
    show(arr)

For more information on using the Python SDK, see the SDK documentation.