使用 AWS SageMaker 及其 Python SDK 製作的機器學習模型,用於使用 Terraform 實現基礎架構設定自動化的 HDFS 日誌分類。
連結:GitHub
語言:HCL(terraform)、Python
# Looks for the XGBoost image URI and builds an XGBoost container. Specify the repo_version depending on preference. container = get_image_uri(boto3.Session().region_name, 'xgboost', repo_version='1.0-1')
hyperparameters = { "max_depth":"5", ## Maximum depth of a tree. Higher means more complex models but risk of overfitting. "eta":"0.2", ## Learning rate. Lower values make the learning process slower but more precise. "gamma":"4", ## Minimum loss reduction required to make a further partition on a leaf node. Controls the model’s complexity. "min_child_weight":"6", ## Minimum sum of instance weight (hessian) needed in a child. Higher values prevent overfitting. "subsample":"0.7", ## Fraction of training data used. Reduces overfitting by sampling part of the data. "objective":"binary:logistic", ## Specifies the learning task and corresponding objective. binary:logistic is for binary classification. "num_round":50 ## Number of boosting rounds, essentially how many times the model is trained. } # A SageMaker estimator that calls the xgboost-container estimator = sagemaker.estimator.Estimator(image_uri=container, # Points to the XGBoost container we previously set up. This tells SageMaker which algorithm container to use. hyperparameters=hyperparameters, # Passes the defined hyperparameters to the estimator. These are the settings that guide the training process. role=sagemaker.get_execution_role(), # Specifies the IAM role that SageMaker assumes during the training job. This role allows access to AWS resources like S3. train_instance_count=1, # Sets the number of training instances. Here, it’s using a single instance. train_instance_type='ml.m5.large', # Specifies the type of instance to use for training. ml.m5.2xlarge is a general-purpose instance with a balance of compute, memory, and network resources. train_volume_size=5, # 5GB # Sets the size of the storage volume attached to the training instance, in GB. Here, it’s 5 GB. output_path=output_path, # Defines where the model artifacts and output of the training job will be saved in S3. train_use_spot_instances=True, # Utilizes spot instances for training, which can be significantly cheaper than on-demand instances. Spot instances are spare EC2 capacity offered at a lower price. train_max_run=300, # Specifies the maximum runtime for the training job in seconds. Here, it's 300 seconds (5 minutes). train_max_wait=600) # Sets the maximum time to wait for the job to complete, including the time waiting for spot instances, in seconds. Here, it's 600 seconds (10 minutes).
estimator.fit({'train': s3_input_train,'validation': s3_input_test})
xgb_predictor = estimator.deploy(initial_instance_count=1,instance_type='ml.m5.large')
# Looks for the XGBoost image URI and builds an XGBoost container. Specify the repo_version depending on preference. container = get_image_uri(boto3.Session().region_name, 'xgboost', repo_version='1.0-1')
hyperparameters = { "max_depth":"5", ## Maximum depth of a tree. Higher means more complex models but risk of overfitting. "eta":"0.2", ## Learning rate. Lower values make the learning process slower but more precise. "gamma":"4", ## Minimum loss reduction required to make a further partition on a leaf node. Controls the model’s complexity. "min_child_weight":"6", ## Minimum sum of instance weight (hessian) needed in a child. Higher values prevent overfitting. "subsample":"0.7", ## Fraction of training data used. Reduces overfitting by sampling part of the data. "objective":"binary:logistic", ## Specifies the learning task and corresponding objective. binary:logistic is for binary classification. "num_round":50 ## Number of boosting rounds, essentially how many times the model is trained. } # A SageMaker estimator that calls the xgboost-container estimator = sagemaker.estimator.Estimator(image_uri=container, # Points to the XGBoost container we previously set up. This tells SageMaker which algorithm container to use. hyperparameters=hyperparameters, # Passes the defined hyperparameters to the estimator. These are the settings that guide the training process. role=sagemaker.get_execution_role(), # Specifies the IAM role that SageMaker assumes during the training job. This role allows access to AWS resources like S3. train_instance_count=1, # Sets the number of training instances. Here, it’s using a single instance. train_instance_type='ml.m5.large', # Specifies the type of instance to use for training. ml.m5.2xlarge is a general-purpose instance with a balance of compute, memory, and network resources. train_volume_size=5, # 5GB # Sets the size of the storage volume attached to the training instance, in GB. Here, it’s 5 GB. output_path=output_path, # Defines where the model artifacts and output of the training job will be saved in S3. train_use_spot_instances=True, # Utilizes spot instances for training, which can be significantly cheaper than on-demand instances. Spot instances are spare EC2 capacity offered at a lower price. train_max_run=300, # Specifies the maximum runtime for the training job in seconds. Here, it's 300 seconds (5 minutes). train_max_wait=600) # Sets the maximum time to wait for the job to complete, including the time waiting for spot instances, in seconds. Here, it's 600 seconds (10 minutes).
在終端機中輸入/貼上 terraform init 來初始化後端。
然後輸入/貼上 terraform Plan 以查看計劃或簡單地進行 terraform 驗證以確保沒有錯誤。
最後在終端機中輸入/貼上 terraform apply --auto-approve
這將顯示兩個輸出,一個作為bucket_name,另一個作為pretrained_ml_instance_name(第三個資源是賦予儲存桶的變數名稱,因為它們是全域資源)。
estimator.fit({'train': s3_input_train,'validation': s3_input_test})
並將其變更為專案目錄所在的路徑並儲存。
xgb_predictor = estimator.deploy(initial_instance_count=1,instance_type='ml.m5.large')
將資料集上傳到 S3 Bucket。
名為「data-bucket-」的 S3 儲存桶,上傳了 2 個物件、一個資料集和包含模型程式碼的 pretrained_sm.ipynb 檔案。
# Looks for the XGBoost image URI and builds an XGBoost container. Specify the repo_version depending on preference. container = get_image_uri(boto3.Session().region_name, 'xgboost', repo_version='1.0-1')
將 pretrained_sm.ipynb 從 S3 上傳到 Notebook 的 Jupyter 環境的終端命令
hyperparameters = { "max_depth":"5", ## Maximum depth of a tree. Higher means more complex models but risk of overfitting. "eta":"0.2", ## Learning rate. Lower values make the learning process slower but more precise. "gamma":"4", ## Minimum loss reduction required to make a further partition on a leaf node. Controls the model’s complexity. "min_child_weight":"6", ## Minimum sum of instance weight (hessian) needed in a child. Higher values prevent overfitting. "subsample":"0.7", ## Fraction of training data used. Reduces overfitting by sampling part of the data. "objective":"binary:logistic", ## Specifies the learning task and corresponding objective. binary:logistic is for binary classification. "num_round":50 ## Number of boosting rounds, essentially how many times the model is trained. } # A SageMaker estimator that calls the xgboost-container estimator = sagemaker.estimator.Estimator(image_uri=container, # Points to the XGBoost container we previously set up. This tells SageMaker which algorithm container to use. hyperparameters=hyperparameters, # Passes the defined hyperparameters to the estimator. These are the settings that guide the training process. role=sagemaker.get_execution_role(), # Specifies the IAM role that SageMaker assumes during the training job. This role allows access to AWS resources like S3. train_instance_count=1, # Sets the number of training instances. Here, it’s using a single instance. train_instance_type='ml.m5.large', # Specifies the type of instance to use for training. ml.m5.2xlarge is a general-purpose instance with a balance of compute, memory, and network resources. train_volume_size=5, # 5GB # Sets the size of the storage volume attached to the training instance, in GB. Here, it’s 5 GB. output_path=output_path, # Defines where the model artifacts and output of the training job will be saved in S3. train_use_spot_instances=True, # Utilizes spot instances for training, which can be significantly cheaper than on-demand instances. Spot instances are spare EC2 capacity offered at a lower price. train_max_run=300, # Specifies the maximum runtime for the training job in seconds. Here, it's 300 seconds (5 minutes). train_max_wait=600) # Sets the maximum time to wait for the job to complete, including the time waiting for spot instances, in seconds. Here, it's 600 seconds (10 minutes).
程式碼單元執行的輸出
estimator.fit({'train': s3_input_train,'validation': s3_input_test})
執行第 8 個單元
xgb_predictor = estimator.deploy(initial_instance_count=1,instance_type='ml.m5.large')
執行第23個單元
# Looks for the XGBoost image URI and builds an XGBoost container. Specify the repo_version depending on preference. container = get_image_uri(boto3.Session().region_name, 'xgboost', repo_version='1.0-1')
執行第 24 個代碼單元
hyperparameters = { "max_depth":"5", ## Maximum depth of a tree. Higher means more complex models but risk of overfitting. "eta":"0.2", ## Learning rate. Lower values make the learning process slower but more precise. "gamma":"4", ## Minimum loss reduction required to make a further partition on a leaf node. Controls the model’s complexity. "min_child_weight":"6", ## Minimum sum of instance weight (hessian) needed in a child. Higher values prevent overfitting. "subsample":"0.7", ## Fraction of training data used. Reduces overfitting by sampling part of the data. "objective":"binary:logistic", ## Specifies the learning task and corresponding objective. binary:logistic is for binary classification. "num_round":50 ## Number of boosting rounds, essentially how many times the model is trained. } # A SageMaker estimator that calls the xgboost-container estimator = sagemaker.estimator.Estimator(image_uri=container, # Points to the XGBoost container we previously set up. This tells SageMaker which algorithm container to use. hyperparameters=hyperparameters, # Passes the defined hyperparameters to the estimator. These are the settings that guide the training process. role=sagemaker.get_execution_role(), # Specifies the IAM role that SageMaker assumes during the training job. This role allows access to AWS resources like S3. train_instance_count=1, # Sets the number of training instances. Here, it’s using a single instance. train_instance_type='ml.m5.large', # Specifies the type of instance to use for training. ml.m5.2xlarge is a general-purpose instance with a balance of compute, memory, and network resources. train_volume_size=5, # 5GB # Sets the size of the storage volume attached to the training instance, in GB. Here, it’s 5 GB. output_path=output_path, # Defines where the model artifacts and output of the training job will be saved in S3. train_use_spot_instances=True, # Utilizes spot instances for training, which can be significantly cheaper than on-demand instances. Spot instances are spare EC2 capacity offered at a lower price. train_max_run=300, # Specifies the maximum runtime for the training job in seconds. Here, it's 300 seconds (5 minutes). train_max_wait=600) # Sets the maximum time to wait for the job to complete, including the time waiting for spot instances, in seconds. Here, it's 600 seconds (10 minutes).
額外的控制台觀察:
estimator.fit({'train': s3_input_train,'validation': s3_input_test})
# Looks for the XGBoost image URI and builds an XGBoost container. Specify the repo_version depending on preference. container = get_image_uri(boto3.Session().region_name, 'xgboost', repo_version='1.0-1')
ClassiSage/downloaded_bucket_content
ClassiSage/.terraform
ClassiSage/ml_ops/pycache
ClassiSage/.terraform.lock.hcl
ClassiSage/terraform.tfstate
ClassiSage/terraform.tfstate.backup
注意:
如果您喜歡這個機器學習專案的想法和實現,該專案使用AWS Cloud 的S3 和SageMaker 進行HDFS 日誌分類,使用Terraform 進行IaC(基礎設施設定自動化),請在查看GitHub 上的專案儲存庫後考慮喜歡這篇文章並加星號.
以上是ClassiSage:基於 Terraform IaC 自動化 AWS SageMaker HDFS 日誌分類模型的詳細內容。更多資訊請關注PHP中文網其他相關文章!