인프라 설정 자동화를 위해 Terraform을 사용하여 HDFS 로그를 분류하기 위해 AWS SageMaker와 해당 Python SDK로 만든 기계 학습 모델입니다.
링크: GitHub
언어: HCL(terraform), Python
# Looks for the XGBoost image URI and builds an XGBoost container. Specify the repo_version depending on preference. container = get_image_uri(boto3.Session().region_name, 'xgboost', repo_version='1.0-1')
hyperparameters = { "max_depth":"5", ## Maximum depth of a tree. Higher means more complex models but risk of overfitting. "eta":"0.2", ## Learning rate. Lower values make the learning process slower but more precise. "gamma":"4", ## Minimum loss reduction required to make a further partition on a leaf node. Controls the model’s complexity. "min_child_weight":"6", ## Minimum sum of instance weight (hessian) needed in a child. Higher values prevent overfitting. "subsample":"0.7", ## Fraction of training data used. Reduces overfitting by sampling part of the data. "objective":"binary:logistic", ## Specifies the learning task and corresponding objective. binary:logistic is for binary classification. "num_round":50 ## Number of boosting rounds, essentially how many times the model is trained. } # A SageMaker estimator that calls the xgboost-container estimator = sagemaker.estimator.Estimator(image_uri=container, # Points to the XGBoost container we previously set up. This tells SageMaker which algorithm container to use. hyperparameters=hyperparameters, # Passes the defined hyperparameters to the estimator. These are the settings that guide the training process. role=sagemaker.get_execution_role(), # Specifies the IAM role that SageMaker assumes during the training job. This role allows access to AWS resources like S3. train_instance_count=1, # Sets the number of training instances. Here, it’s using a single instance. train_instance_type='ml.m5.large', # Specifies the type of instance to use for training. ml.m5.2xlarge is a general-purpose instance with a balance of compute, memory, and network resources. train_volume_size=5, # 5GB # Sets the size of the storage volume attached to the training instance, in GB. Here, it’s 5 GB. output_path=output_path, # Defines where the model artifacts and output of the training job will be saved in S3. train_use_spot_instances=True, # Utilizes spot instances for training, which can be significantly cheaper than on-demand instances. Spot instances are spare EC2 capacity offered at a lower price. train_max_run=300, # Specifies the maximum runtime for the training job in seconds. Here, it's 300 seconds (5 minutes). train_max_wait=600) # Sets the maximum time to wait for the job to complete, including the time waiting for spot instances, in seconds. Here, it's 600 seconds (10 minutes).
estimator.fit({'train': s3_input_train,'validation': s3_input_test})
xgb_predictor = estimator.deploy(initial_instance_count=1,instance_type='ml.m5.large')
# Looks for the XGBoost image URI and builds an XGBoost container. Specify the repo_version depending on preference. container = get_image_uri(boto3.Session().region_name, 'xgboost', repo_version='1.0-1')
hyperparameters = { "max_depth":"5", ## Maximum depth of a tree. Higher means more complex models but risk of overfitting. "eta":"0.2", ## Learning rate. Lower values make the learning process slower but more precise. "gamma":"4", ## Minimum loss reduction required to make a further partition on a leaf node. Controls the model’s complexity. "min_child_weight":"6", ## Minimum sum of instance weight (hessian) needed in a child. Higher values prevent overfitting. "subsample":"0.7", ## Fraction of training data used. Reduces overfitting by sampling part of the data. "objective":"binary:logistic", ## Specifies the learning task and corresponding objective. binary:logistic is for binary classification. "num_round":50 ## Number of boosting rounds, essentially how many times the model is trained. } # A SageMaker estimator that calls the xgboost-container estimator = sagemaker.estimator.Estimator(image_uri=container, # Points to the XGBoost container we previously set up. This tells SageMaker which algorithm container to use. hyperparameters=hyperparameters, # Passes the defined hyperparameters to the estimator. These are the settings that guide the training process. role=sagemaker.get_execution_role(), # Specifies the IAM role that SageMaker assumes during the training job. This role allows access to AWS resources like S3. train_instance_count=1, # Sets the number of training instances. Here, it’s using a single instance. train_instance_type='ml.m5.large', # Specifies the type of instance to use for training. ml.m5.2xlarge is a general-purpose instance with a balance of compute, memory, and network resources. train_volume_size=5, # 5GB # Sets the size of the storage volume attached to the training instance, in GB. Here, it’s 5 GB. output_path=output_path, # Defines where the model artifacts and output of the training job will be saved in S3. train_use_spot_instances=True, # Utilizes spot instances for training, which can be significantly cheaper than on-demand instances. Spot instances are spare EC2 capacity offered at a lower price. train_max_run=300, # Specifies the maximum runtime for the training job in seconds. Here, it's 300 seconds (5 minutes). train_max_wait=600) # Sets the maximum time to wait for the job to complete, including the time waiting for spot instances, in seconds. Here, it's 600 seconds (10 minutes).
터미널에 terraform init를 입력하거나 붙여넣어 백엔드를 초기화하세요.
그런 다음 terraform Plan을 입력/붙여넣어 계획을 보거나 단순히 terraform 검증을 수행하여 오류가 없는지 확인하세요.
마지막으로 터미널 유형/붙여넣기 terraform Apply --auto-approve
이렇게 하면 bucket_name과 pretrained_ml_instance_name이라는 두 개의 출력이 표시됩니다(세 번째 리소스는 전역 리소스이므로 버킷에 지정된 변수 이름입니다).
estimator.fit({'train': s3_input_train,'validation': s3_input_test})
프로젝트 디렉터리가 있는 경로로 변경하고 저장하세요.
xgb_predictor = estimator.deploy(initial_instance_count=1,instance_type='ml.m5.large')
S3 버킷에 데이터 세트를 업로드합니다.
2개의 객체가 업로드된 'data-bucket-'이라는 이름의 S3 버킷, 데이터 세트 및 모델 코드가 포함된 pretrained_sm.ipynb 파일.
# Looks for the XGBoost image URI and builds an XGBoost container. Specify the repo_version depending on preference. container = get_image_uri(boto3.Session().region_name, 'xgboost', repo_version='1.0-1')
pretrained_sm.ipynb를 S3에서 Notebook의 Jupyter 환경으로 업로드하는 터미널 명령
hyperparameters = { "max_depth":"5", ## Maximum depth of a tree. Higher means more complex models but risk of overfitting. "eta":"0.2", ## Learning rate. Lower values make the learning process slower but more precise. "gamma":"4", ## Minimum loss reduction required to make a further partition on a leaf node. Controls the model’s complexity. "min_child_weight":"6", ## Minimum sum of instance weight (hessian) needed in a child. Higher values prevent overfitting. "subsample":"0.7", ## Fraction of training data used. Reduces overfitting by sampling part of the data. "objective":"binary:logistic", ## Specifies the learning task and corresponding objective. binary:logistic is for binary classification. "num_round":50 ## Number of boosting rounds, essentially how many times the model is trained. } # A SageMaker estimator that calls the xgboost-container estimator = sagemaker.estimator.Estimator(image_uri=container, # Points to the XGBoost container we previously set up. This tells SageMaker which algorithm container to use. hyperparameters=hyperparameters, # Passes the defined hyperparameters to the estimator. These are the settings that guide the training process. role=sagemaker.get_execution_role(), # Specifies the IAM role that SageMaker assumes during the training job. This role allows access to AWS resources like S3. train_instance_count=1, # Sets the number of training instances. Here, it’s using a single instance. train_instance_type='ml.m5.large', # Specifies the type of instance to use for training. ml.m5.2xlarge is a general-purpose instance with a balance of compute, memory, and network resources. train_volume_size=5, # 5GB # Sets the size of the storage volume attached to the training instance, in GB. Here, it’s 5 GB. output_path=output_path, # Defines where the model artifacts and output of the training job will be saved in S3. train_use_spot_instances=True, # Utilizes spot instances for training, which can be significantly cheaper than on-demand instances. Spot instances are spare EC2 capacity offered at a lower price. train_max_run=300, # Specifies the maximum runtime for the training job in seconds. Here, it's 300 seconds (5 minutes). train_max_wait=600) # Sets the maximum time to wait for the job to complete, including the time waiting for spot instances, in seconds. Here, it's 600 seconds (10 minutes).
코드셀 실행 결과
estimator.fit({'train': s3_input_train,'validation': s3_input_test})
8번째 셀 실행
xgb_predictor = estimator.deploy(initial_instance_count=1,instance_type='ml.m5.large')
23번째 셀 실행
# Looks for the XGBoost image URI and builds an XGBoost container. Specify the repo_version depending on preference. container = get_image_uri(boto3.Session().region_name, 'xgboost', repo_version='1.0-1')
24번째 코드셀 실행
hyperparameters = { "max_depth":"5", ## Maximum depth of a tree. Higher means more complex models but risk of overfitting. "eta":"0.2", ## Learning rate. Lower values make the learning process slower but more precise. "gamma":"4", ## Minimum loss reduction required to make a further partition on a leaf node. Controls the model’s complexity. "min_child_weight":"6", ## Minimum sum of instance weight (hessian) needed in a child. Higher values prevent overfitting. "subsample":"0.7", ## Fraction of training data used. Reduces overfitting by sampling part of the data. "objective":"binary:logistic", ## Specifies the learning task and corresponding objective. binary:logistic is for binary classification. "num_round":50 ## Number of boosting rounds, essentially how many times the model is trained. } # A SageMaker estimator that calls the xgboost-container estimator = sagemaker.estimator.Estimator(image_uri=container, # Points to the XGBoost container we previously set up. This tells SageMaker which algorithm container to use. hyperparameters=hyperparameters, # Passes the defined hyperparameters to the estimator. These are the settings that guide the training process. role=sagemaker.get_execution_role(), # Specifies the IAM role that SageMaker assumes during the training job. This role allows access to AWS resources like S3. train_instance_count=1, # Sets the number of training instances. Here, it’s using a single instance. train_instance_type='ml.m5.large', # Specifies the type of instance to use for training. ml.m5.2xlarge is a general-purpose instance with a balance of compute, memory, and network resources. train_volume_size=5, # 5GB # Sets the size of the storage volume attached to the training instance, in GB. Here, it’s 5 GB. output_path=output_path, # Defines where the model artifacts and output of the training job will be saved in S3. train_use_spot_instances=True, # Utilizes spot instances for training, which can be significantly cheaper than on-demand instances. Spot instances are spare EC2 capacity offered at a lower price. train_max_run=300, # Specifies the maximum runtime for the training job in seconds. Here, it's 300 seconds (5 minutes). train_max_wait=600) # Sets the maximum time to wait for the job to complete, including the time waiting for spot instances, in seconds. Here, it's 600 seconds (10 minutes).
추가 콘솔 관찰:
estimator.fit({'train': s3_input_train,'validation': s3_input_test})
# Looks for the XGBoost image URI and builds an XGBoost container. Specify the repo_version depending on preference. container = get_image_uri(boto3.Session().region_name, 'xgboost', repo_version='1.0-1')
ClassiSage/downloaded_bucket_content
ClassiSage/.terraform
ClassiSage/ml_ops/pycache
ClassiSage/.terraform.lock.hcl
ClassiSage/terraform.tfstate
ClassiSage/terraform.tfstate.backup
참고:
HDFS 로그 분류를 위해 AWS Cloud의 S3 및 SageMaker를 사용하고 IaC(인프라 설정 자동화)용 Terraform을 사용하는 이 기계 학습 프로젝트의 아이디어와 구현이 마음에 드셨다면 GitHub에서 프로젝트 저장소를 확인한 후 이 게시물에 좋아요를 누르고 별표를 표시해 주시기 바랍니다. .
위 내용은 ClassiSage: Terraform IaC 자동화된 AWS SageMaker 기반 HDFS 로그 분류 모델의 상세 내용입니다. 자세한 내용은 PHP 중국어 웹사이트의 기타 관련 기사를 참조하세요!