Downloading MODIS Data by Date, Time, and Latitude/Longitude, and Batch Processing It

2023-11-08 09:01

This article describes how to download MODIS data based on date, time, and latitude/longitude, and how to batch process the results. I hope it is a useful reference for anyone facing the same problem.

1. Batch downloading MODIS data with Python by date, time, and latitude/longitude

I wanted to download MODIS data (a product with one scene every 5 minutes) matching the times and locations of a set of in-situ measurement points, to use as a comparison.

I had considered several approaches, such as using GEE, but the MODIS product I needed is not available on GEE.

So I eventually turned to this website:

National Snow and Ice Data Center (NSIDC)

On this site you can search for the product you need to download (MYD29 in my case); searching for something like "MOD29 download" takes you to:

MODIS/Terra Sea Ice Extent 5-Min L2 Swath 1km, Version 61 | National Snow and Ice Data Center

The page offers several download options; choose the second one, the Data Access Tool, and click Get Data.

Then enter one of your in-situ measurement points into the input fields on the left; matching granules will appear on the right. Click Download Script to obtain a Python script that downloads the data matching those criteria. The code is listed below. Just change the username and password used in the functions, then modify the main function to set a different bounding box (location), time range, and so on, and the script will search for and download the data according to your requirements.

However, the script often stops when the network is unstable, so you may need to restart it yourself, or add restart logic inside the exception handling. One thing worth noting: a global variable that is modified inside a function will not simply keep its new value when the script is restarted; if the value changed, you have to pass the updated value back in, otherwise after the restart it will still be the value defined at the top of the script. A small driver sketch with a per-point loop and a simple retry is given after the script below.

#!/usr/bin/env python
# ----------------------------------------------------------------------------
# NSIDC Data Download Script
#
# Copyright (c) 2023 Regents of the University of Colorado
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# Tested in Python 2.7 and Python 3.4, 3.6, 3.7, 3.8, 3.9
#
# To run the script at a Linux, macOS, or Cygwin command-line terminal:
#   $ python nsidc-data-download.py
#
# On Windows, open Start menu -> Run and type cmd. Then type:
#     python nsidc-data-download.py
#
# The script will first search Earthdata for all matching files.
# You will then be prompted for your Earthdata username/password
# and the script will download the matching files.
#
# If you wish, you may store your Earthdata username/password in a .netrc
# file in your $HOME directory and the script will automatically attempt to
# read this file. The .netrc file should have the following format:
#    machine urs.earthdata.nasa.gov login MYUSERNAME password MYPASSWORD
# where 'MYUSERNAME' and 'MYPASSWORD' are your Earthdata credentials.
#
# Instead of a username/password, you may use an Earthdata bearer token.
# To construct a bearer token, log into Earthdata and choose "Generate Token".
# To use the token, when the script prompts for your username,
# just press Return (Enter). You will then be prompted for your token.
# You can store your bearer token in the .netrc file in the following format:
#    machine urs.earthdata.nasa.gov login token password MYBEARERTOKEN
# where 'MYBEARERTOKEN' is your Earthdata bearer token.
#
from __future__ import print_function
import base64
import getopt
import itertools
import json
import math
import netrc
import os.path
import ssl
import sys
import time
from getpass import getpass

try:
    from urllib.parse import urlparse
    from urllib.request import urlopen, Request, build_opener, HTTPCookieProcessor
    from urllib.error import HTTPError, URLError
except ImportError:
    from urlparse import urlparse
    from urllib2 import urlopen, Request, HTTPError, URLError, build_opener, HTTPCookieProcessor


short_name = 'MYD29'
version = '61'
time_start = '2002-07-04T00:00:00Z'
time_end = '2023-11-07T04:01:18Z'
bounding_box = ''
polygon = ''
filename_filter = '*MYD29.A2020001.1855.061.2020321085433*'
url_list = []

CMR_URL = 'https://cmr.earthdata.nasa.gov'
URS_URL = 'https://urs.earthdata.nasa.gov'
CMR_PAGE_SIZE = 2000
CMR_FILE_URL = ('{0}/search/granules.json?provider=NSIDC_ECS'
                '&sort_key[]=start_date&sort_key[]=producer_granule_id'
                '&scroll=true&page_size={1}'.format(CMR_URL, CMR_PAGE_SIZE))


def get_username():
    username = ''

    # For Python 2/3 compatibility:
    try:
        do_input = raw_input  # noqa
    except NameError:
        do_input = input

    username = do_input('Earthdata username (or press Return to use a bearer token): ')
    return username


def get_password():
    password = ''
    while not password:
        password = getpass('password: ')
    return password


def get_token():
    token = ''
    while not token:
        token = getpass('bearer token: ')
    return token


def get_login_credentials():
    """Get user credentials from .netrc or prompt for input."""
    credentials = None
    token = None

    try:
        info = netrc.netrc()
        username, account, password = info.authenticators(urlparse(URS_URL).hostname)
        if username == 'token':
            token = password
        else:
            credentials = '{0}:{1}'.format(username, password)
            credentials = base64.b64encode(credentials.encode('ascii')).decode('ascii')
    except Exception:
        username = None
        password = None

    if not username:
        username = get_username()
        if len(username):
            password = get_password()
            credentials = '{0}:{1}'.format(username, password)
            credentials = base64.b64encode(credentials.encode('ascii')).decode('ascii')
        else:
            token = get_token()

    return credentials, token


def build_version_query_params(version):
    desired_pad_length = 3
    if len(version) > desired_pad_length:
        print('Version string too long: "{0}"'.format(version))
        quit()

    version = str(int(version))  # Strip off any leading zeros
    query_params = ''

    while len(version) <= desired_pad_length:
        padded_version = version.zfill(desired_pad_length)
        query_params += '&version={0}'.format(padded_version)
        desired_pad_length -= 1
    return query_params


def filter_add_wildcards(filter):
    if not filter.startswith('*'):
        filter = '*' + filter
    if not filter.endswith('*'):
        filter = filter + '*'
    return filter


def build_filename_filter(filename_filter):
    filters = filename_filter.split(',')
    result = '&options[producer_granule_id][pattern]=true'
    for filter in filters:
        result += '&producer_granule_id[]=' + filter_add_wildcards(filter)
    return result


def build_cmr_query_url(short_name, version, time_start, time_end,
                        bounding_box=None, polygon=None,
                        filename_filter=None):
    params = '&short_name={0}'.format(short_name)
    params += build_version_query_params(version)
    params += '&temporal[]={0},{1}'.format(time_start, time_end)
    if polygon:
        params += '&polygon={0}'.format(polygon)
    elif bounding_box:
        params += '&bounding_box={0}'.format(bounding_box)
    if filename_filter:
        params += build_filename_filter(filename_filter)
    return CMR_FILE_URL + params


def get_speed(time_elapsed, chunk_size):
    if time_elapsed <= 0:
        return ''
    speed = chunk_size / time_elapsed
    if speed <= 0:
        speed = 1
    size_name = ('', 'k', 'M', 'G', 'T', 'P', 'E', 'Z', 'Y')
    i = int(math.floor(math.log(speed, 1000)))
    p = math.pow(1000, i)
    return '{0:.1f}{1}B/s'.format(speed / p, size_name[i])


def output_progress(count, total, status='', bar_len=60):
    if total <= 0:
        return
    fraction = min(max(count / float(total), 0), 1)
    filled_len = int(round(bar_len * fraction))
    percents = int(round(100.0 * fraction))
    bar = '=' * filled_len + ' ' * (bar_len - filled_len)
    fmt = '  [{0}] {1:3d}%  {2}   '.format(bar, percents, status)
    print('\b' * (len(fmt) + 4), end='')  # clears the line
    sys.stdout.write(fmt)
    sys.stdout.flush()


def cmr_read_in_chunks(file_object, chunk_size=1024 * 1024):
    """Read a file in chunks using a generator. Default chunk size: 1Mb."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data


def get_login_response(url, credentials, token):
    opener = build_opener(HTTPCookieProcessor())

    req = Request(url)
    if token:
        req.add_header('Authorization', 'Bearer {0}'.format(token))
    elif credentials:
        try:
            response = opener.open(req)
            # We have a redirect URL - try again with authorization.
            url = response.url
        except HTTPError:
            # No redirect - just try again with authorization.
            pass
        except Exception as e:
            print('Error{0}: {1}'.format(type(e), str(e)))
            sys.exit(1)

        req = Request(url)
        req.add_header('Authorization', 'Basic {0}'.format(credentials))

    try:
        response = opener.open(req)
    except HTTPError as e:
        err = 'HTTP error {0}, {1}'.format(e.code, e.reason)
        if 'Unauthorized' in e.reason:
            if token:
                err += ': Check your bearer token'
            else:
                err += ': Check your username and password'
        print(err)
        sys.exit(1)
    except Exception as e:
        print('Error{0}: {1}'.format(type(e), str(e)))
        sys.exit(1)

    return response


def cmr_download(urls, force=False, quiet=False):
    """Download files from list of urls."""
    if not urls:
        return
    url_count = len(urls)
    if not quiet:
        print('Downloading {0} files...'.format(url_count))
    credentials = None
    token = None

    for index, url in enumerate(urls, start=1):
        if not credentials and not token:
            p = urlparse(url)
            if p.scheme == 'https':
                credentials, token = get_login_credentials()

        filename = url.split('/')[-1]
        if not quiet:
            print('{0}/{1}: {2}'.format(str(index).zfill(len(str(url_count))),
                                        url_count, filename))

        try:
            response = get_login_response(url, credentials, token)
            length = int(response.headers['content-length'])
            try:
                if not force and length == os.path.getsize(filename):
                    if not quiet:
                        print('  File exists, skipping')
                    continue
            except OSError:
                pass
            count = 0
            chunk_size = min(max(length, 1), 1024 * 1024)
            max_chunks = int(math.ceil(length / chunk_size))
            time_initial = time.time()
            with open(filename, 'wb') as out_file:
                for data in cmr_read_in_chunks(response, chunk_size=chunk_size):
                    out_file.write(data)
                    if not quiet:
                        count = count + 1
                        time_elapsed = time.time() - time_initial
                        download_speed = get_speed(time_elapsed, count * chunk_size)
                        output_progress(count, max_chunks, status=download_speed)
            if not quiet:
                print()
        except HTTPError as e:
            print('HTTP error {0}, {1}'.format(e.code, e.reason))
        except URLError as e:
            print('URL error: {0}'.format(e.reason))
        except IOError:
            raise


def cmr_filter_urls(search_results):
    """Select only the desired data files from CMR response."""
    if 'feed' not in search_results or 'entry' not in search_results['feed']:
        return []

    entries = [e['links']
               for e in search_results['feed']['entry']
               if 'links' in e]
    # Flatten "entries" to a simple list of links
    links = list(itertools.chain(*entries))

    urls = []
    unique_filenames = set()
    for link in links:
        if 'href' not in link:
            # Exclude links with nothing to download
            continue
        if 'inherited' in link and link['inherited'] is True:
            # Why are we excluding these links?
            continue
        if 'rel' in link and 'data#' not in link['rel']:
            # Exclude links which are not classified by CMR as "data" or "metadata"
            continue

        if 'title' in link and 'opendap' in link['title'].lower():
            # Exclude OPeNDAP links--they are responsible for many duplicates
            # This is a hack; when the metadata is updated to properly identify
            # non-datapool links, we should be able to do this in a non-hack way
            continue

        filename = link['href'].split('/')[-1]
        if filename in unique_filenames:
            # Exclude links with duplicate filenames (they would overwrite)
            continue
        unique_filenames.add(filename)

        urls.append(link['href'])

    return urls


def cmr_search(short_name, version, time_start, time_end,
               bounding_box='', polygon='', filename_filter='', quiet=False):
    """Perform a scrolling CMR query for files matching input criteria."""
    cmr_query_url = build_cmr_query_url(short_name=short_name, version=version,
                                        time_start=time_start, time_end=time_end,
                                        bounding_box=bounding_box,
                                        polygon=polygon, filename_filter=filename_filter)
    if not quiet:
        print('Querying for data:\n\t{0}\n'.format(cmr_query_url))

    cmr_scroll_id = None
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    urls = []
    hits = 0
    while True:
        req = Request(cmr_query_url)
        if cmr_scroll_id:
            req.add_header('cmr-scroll-id', cmr_scroll_id)
        try:
            response = urlopen(req, context=ctx)
        except Exception as e:
            print('Error: ' + str(e))
            sys.exit(1)
        if not cmr_scroll_id:
            # Python 2 and 3 have different case for the http headers
            headers = {k.lower(): v for k, v in dict(response.info()).items()}
            cmr_scroll_id = headers['cmr-scroll-id']
            hits = int(headers['cmr-hits'])
            if not quiet:
                if hits > 0:
                    print('Found {0} matches.'.format(hits))
                else:
                    print('Found no matches.')
        search_page = response.read()
        search_page = json.loads(search_page.decode('utf-8'))
        url_scroll_results = cmr_filter_urls(search_page)
        if not url_scroll_results:
            break
        if not quiet and hits > CMR_PAGE_SIZE:
            print('.', end='')
            sys.stdout.flush()
        urls += url_scroll_results

    if not quiet and hits > CMR_PAGE_SIZE:
        print()
    return urls


def main(argv=None):
    global short_name, version, time_start, time_end, bounding_box, \
        polygon, filename_filter, url_list

    if argv is None:
        argv = sys.argv[1:]

    force = False
    quiet = False
    usage = 'usage: nsidc-download_***.py [--help, -h] [--force, -f] [--quiet, -q]'

    try:
        opts, args = getopt.getopt(argv, 'hfq', ['help', 'force', 'quiet'])
        for opt, _arg in opts:
            if opt in ('-f', '--force'):
                force = True
            elif opt in ('-q', '--quiet'):
                quiet = True
            elif opt in ('-h', '--help'):
                print(usage)
                sys.exit(0)
    except getopt.GetoptError as e:
        print(e.args[0])
        print(usage)
        sys.exit(1)

    # Supply some default search parameters, just for testing purposes.
    # These are only used if the parameters aren't filled in up above.
    if 'short_name' in short_name:
        short_name = 'ATL06'
        version = '003'
        time_start = '2018-10-14T00:00:00Z'
        time_end = '2021-01-08T21:48:13Z'
        bounding_box = ''
        polygon = ''
        filename_filter = '*ATL06_2020111121*'
        url_list = []

    try:
        if not url_list:
            url_list = cmr_search(short_name, version, time_start, time_end,
                                  bounding_box=bounding_box, polygon=polygon,
                                  filename_filter=filename_filter, quiet=quiet)

        cmr_download(url_list, force=force, quiet=quiet)
    except KeyboardInterrupt:
        quit()


if __name__ == '__main__':
    main()
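For reference, below is a minimal sketch of how the script's functions could be driven in a loop over in-situ points, with a simple retry for unstable connections. It is not part of the official NSIDC script: the point list, the box half-width, the retry count, and the file name nsidc_download.py are all assumptions for illustration; it assumes the code above has been saved under that name so that cmr_search and cmr_download can be imported.

# batch_by_points.py -- illustrative sketch, not part of the official NSIDC script.
# Assumes the downloaded script above is saved alongside as nsidc_download.py.
import time

from nsidc_download import cmr_search, cmr_download

# Hypothetical in-situ points: (lat, lon, UTC date of the measurement).
points = [
    (75.0, -45.0, '2020-01-01'),
    (78.5, 10.0, '2020-01-03'),
]

half_deg = 0.1  # assumed half-width of the search box around each point, in degrees

for lat, lon, day in points:
    # CMR expects bounding_box as 'west,south,east,north' (naive; ignores the dateline).
    bbox = '{0},{1},{2},{3}'.format(lon - half_deg, lat - half_deg,
                                    lon + half_deg, lat + half_deg)
    t_start = '{0}T00:00:00Z'.format(day)
    t_end = '{0}T23:59:59Z'.format(day)

    urls = cmr_search('MYD29', '61', t_start, t_end, bounding_box=bbox)

    # Simple retry: downloads often drop on an unstable connection, so re-invoke
    # cmr_download a few times; files that already exist with the right size are skipped.
    for attempt in range(3):
        try:
            cmr_download(urls)
            break
        except Exception as e:
            print('Download failed ({0}), retrying...'.format(e))
            time.sleep(10)

Because cmr_download skips files whose size already matches, simply calling it again after a dropped connection resumes the remaining granules rather than re-downloading everything.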

2. Batch processing MODIS data

For this step the HEG tool is enough: select input-file, choose the second option, point it at one HDF file, adjust the various parameters below it, and finally choose batch run. HEG will then process all of the MODIS files in that folder, and the output GeoTIFFs appear in the HEGOUT folder. One caveat: in my experience it seems to handle at most a little over 900 scenes in one batch, after which it simply stops running, and I do not know why. A small helper sketch for splitting the input folder into batches is given below.
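Since the batch run appeared to stall after roughly 900 scenes, one workaround is to split the downloaded HDF files into smaller sub-folders and run each batch through HEG separately. The sketch below only illustrates that idea: the folder paths and the batch size of 500 are assumptions, and it merely moves files; the HEG batch run itself is still started the usual way for each sub-folder.

# split_hdf_batches.py -- illustrative helper; paths and batch size are assumptions.
import os
import shutil

src_dir = r'D:\modis\hdf_all'       # folder holding all downloaded MYD29 HDF files (assumed)
dst_root = r'D:\modis\hdf_batches'  # where the batch sub-folders will be created (assumed)
batch_size = 500                    # stay well below the ~900-scene limit observed above

hdf_files = sorted(f for f in os.listdir(src_dir) if f.lower().endswith('.hdf'))

for i in range(0, len(hdf_files), batch_size):
    batch_dir = os.path.join(dst_root, 'batch_{0:03d}'.format(i // batch_size + 1))
    os.makedirs(batch_dir, exist_ok=True)
    for name in hdf_files[i:i + batch_size]:
        # Move each granule into its batch folder; use shutil.copy2 instead
        # if you prefer to keep the originals in place.
        shutil.move(os.path.join(src_dir, name), os.path.join(batch_dir, name))
    print('Prepared {0} with {1} files'.format(batch_dir, min(batch_size, len(hdf_files) - i)))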



