AWS - Security - Amazon CloudWatch (AWS cloud resources)
Amazon CloudWatch 
- a monitoring service for AWS cloud resources and the applications run on AWS.
- a repository for metric data
- a service that stores metrics
- accessed via API, CLI, AWS SDKs, and the AWS Management Console.
- By default, CloudWatch can’t monitor information internal to an EC2 instance, such as memory usage; collecting it requires extra configuration.
- CloudWatch integrates with AWS IAM
- specify which users in your AWS account can perform CloudWatch actions.
- for example, create an IAM policy that gives only certain users permission to use the GetMetricStatistics operation.
- They could then use the operation to retrieve data about your cloud resources.
- Permission:
- can’t use IAM to control access to CloudWatch data for specific resources.
- For example,
- you can’t give a user access to CloudWatch data for only a specific set of instances or a specific load balancer.
- Permissions that are granted through IAM cover all of the cloud resources that you use with CloudWatch.
- can’t use IAM roles with the Amazon CloudWatch command line tools
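The IAM point above can be illustrated with a minimal policy document. This is a sketch, not a recommended production policy: it builds the JSON with Python's standard library only, and the variable names `policy` and `policy_json` are illustrative. The `"Resource": "*"` entry mirrors the note that access cannot be narrowed to specific instances or load balancers.

```python
import json

# Identity-based policy that allows only the GetMetricStatistics operation.
# "Resource": "*" reflects the limitation described above: the permission
# covers all cloud resources used with CloudWatch.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["cloudwatch:GetMetricStatistics"],
            "Resource": "*",
        }
    ],
}

# Serialize to the JSON text that would be attached to a user or group.
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Attaching the resulting document to selected users or groups is what restricts the operation to them.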
- The unified CloudWatch Agent
- runs in the cloud and on-premises, and on Linux and Windows instances and servers.
- It also handles metrics and log files.
- can deploy it using AWS Systems Manager, Run Command, SSM State Manager, or from the CLI.
- on its own, it doesn’t provide the permissions to write to CloudWatch Logs.
- provides the connection needed for on-premises servers to push data to CloudWatch and CloudWatch Logs.
- execution roles or permission roles:
- permits a service to write logs to CloudWatch Logs
- Monitors, Collects and tracks
- a distributed metrics-gathering, monitoring, and observability service
- built for DevOps engineers, developers, site reliability engineers (SRE), and IT managers.
- statistics are recorded for a period of 15 months
- monitors AWS resources and the applications run on AWS in real time
- collects and processes raw data into readable, near-real-time metrics
- collect and track metrics, which are variables you can measure for
- your AWS resources and applications.
- Create and use custom metrics
- based on data generated by applications and services
- along with any log files that applications generate.
- Create custom dashboards for easy viewing of metrics
- Example:
- Lambda functions, Kinesis streams, Amazon ECS tasks, Step Functions state machines,
- SNS topics, SQS queues, and built-in targets…
- CloudTrail, Route53, VPC flow logs
- EC2
- By default, EC2 provides basic monitoring, which sends metric data to CloudWatch in 5-minute periods.
- To send metric data for your instance to CloudWatch in 1-minute periods, enable detailed monitoring on the instance.
- EC2 console displays a series of graphs based on the raw data from Amazon CloudWatch.
- Depending on your needs, you might prefer to get data for your instances from Amazon CloudWatch instead of through the graphs in the console.
- By default, Amazon CloudWatch does not provide RAM metrics for EC2 instances, though you can configure CloudWatch to collect that data if you want it.
- RDS:
- number of simultaneous connections: xx for xx min
- ELB:
- number of healthy hosts: xx for xx min
- DynamoDB tables.
- RDS DB instances.
- Custom metrics generated by applications and services.
- Any log files generated by your applications.
- gain system-wide visibility into
- resource use,
- application performance,
- operational health.
- historical information
- a better perspective on how your web application or service is performing.
- AWS resources produce metrics, which are time-ordered sets of data points.
- Turns metrics into statistics to be used by CloudWatch alarms
- Metrics can be configured with alarms that can take action.
- collects metrics,
- turns the metrics into statistics that can be used by CloudWatch alarms,
- and displays them all in one place.
- CloudWatch alarms are based on statistics.
- Statistics are metric data that is aggregated over specified periods of time.
- Aggregations are made using the namespace, metric name, dimensions, and the data point unit of measure, within the time period that you specify.
- Namespace:
- A namespace contains the CloudWatch metric that you want, for example, AWS/EC2.
- Metric:
- the variable you want to measure, for example, CPU Utilization.
- Statistic:
- can be an average, sum, minimum, maximum, sample count, a predefined percentile, or a custom percentile.
- Period:
- the evaluation period for the alarm.
- When the alarm is evaluated, each period is aggregated into one data point.
- Conditions:
- for a static threshold, specify whether the metric must be Greater, Greater or Equal, Lower or Equal, or Lower than the threshold value, and specify the threshold value itself.
- Additional configuration information:
- includes the number of data points within the evaluation period that must be breaching to trigger the alarm, and how CloudWatch should treat missing data when it evaluates the alarm.
- Actions:
- choose to send a notification to an Amazon SNS topic,
- or to perform an Amazon EC2 Auto Scaling action or Amazon EC2 action.
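The alarm components above (statistic, period, condition, data points to alarm) can be sketched in plain Python. This is a simplified illustration under stated assumptions, not how CloudWatch is implemented: the helper names `aggregate` and `evaluate_alarm` are hypothetical, missing-data treatment is not modeled, and only the comparison operator and state names follow CloudWatch's own vocabulary.

```python
from statistics import mean

# Statistic name -> function that aggregates the raw samples of one period.
STATISTICS = {"Average": mean, "Sum": sum, "Minimum": min,
              "Maximum": max, "SampleCount": len}

# Comparison operator name -> test of one data point against the threshold.
COMPARATORS = {
    "GreaterThanThreshold": lambda v, t: v > t,
    "GreaterThanOrEqualToThreshold": lambda v, t: v >= t,
    "LessThanThreshold": lambda v, t: v < t,
    "LessThanOrEqualToThreshold": lambda v, t: v <= t,
}

def aggregate(samples, period, statistic):
    """Turn (timestamp_seconds, value) samples into one data point per period."""
    buckets = {}
    for timestamp, value in samples:
        buckets.setdefault(timestamp // period, []).append(value)
    return [STATISTICS[statistic](buckets[key]) for key in sorted(buckets)]

def evaluate_alarm(datapoints, threshold, comparison,
                   evaluation_periods, datapoints_to_alarm):
    """Return the alarm state for the most recent evaluation periods."""
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"  # not enough data to judge the state
    breaching = sum(1 for v in recent if COMPARATORS[comparison](v, threshold))
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"

# CPU samples taken every few minutes, aggregated into 5-minute averages.
samples = [(0, 50), (60, 95), (300, 92), (360, 96), (600, 91), (660, 99)]
points = aggregate(samples, period=300, statistic="Average")
state = evaluate_alarm(points, threshold=90, comparison="GreaterThanThreshold",
                       evaluation_periods=3, datapoints_to_alarm=2)
print(points, state)
```

Here two of the three aggregated data points exceed 90, so a "2 out of 3" alarm moves to the ALARM state; with fewer than three data points it would stay in INSUFFICIENT_DATA.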
- Events
- use CloudWatch Events to define rules
- match incoming events/changes in AWS environment and route them to targets for processing.
- operational changes as they occur
- if a rule is matched, can take corrective action as necessary
- Targets:
- EC2 instances, Lambda functions, Kinesis streams, Amazon ECS tasks, Step Functions state machines, SNS topics, SQS queues, and built-in targets…
- CloudWatch Events already has access to AWS API events;
- CloudTrail only needs to be enabled for services that aren’t directly supported.
- CloudWatch Events becomes aware of operational changes as they occur.
- CloudWatch Events responds to these operational changes and takes corrective action as necessary, by
- sending messages to respond to the environment,
- activating functions,
- making changes,
- and capturing state information.
- 2 options when creating a rule:
- invoke targets when an event matches an event pattern
- invoke targets on a schedule
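The two rule options above can be sketched as follows. The event pattern uses the documented shape for EC2 state-change events, and `rate(5 minutes)` is a valid schedule expression, but the `matches` function is a deliberately simplified, hypothetical matcher: the real service supports richer matching (prefixes, numeric ranges, and so on) that is not modeled here, and the instance ID is a made-up placeholder.

```python
def matches(pattern, event):
    """Simplified matcher: every pattern key must exist in the event.
    A list means "the event value is one of these"; a dict recurses."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True

# Option 1: a rule driven by an event pattern.
pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}

# Option 2: a rule driven by a schedule instead of a pattern.
schedule_rule = {"ScheduleExpression": "rate(5 minutes)"}

# A sample incoming event (instance ID is a placeholder).
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
}

if matches(pattern, event):
    print("rule matched: route the event to its targets")
```

A matched rule would then hand the event to its targets, such as a Lambda function or an SNS topic.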
- Alarms and actions
- key components of a CloudWatch alarm
- 3 states:
- insufficient data:
- not enough data to judge the state
- alarms often start in this state
- alarm
- the alarm threshold has been breached
- such as: >90% CPU
- ok
- the threshold has not been breached
- thresholds
- create alarms based on:
- static thresholds
- anomaly detection.
- metric math expression
- metrics
- measured data points over time.
- action
- action can produce an email or even work in conjunction with Auto Scaling groups.
- period:
- period is related to its threshold.
- the length of time for which a threshold must be surpassed before an alarm is generated.
- can be configured with alarms to take actions
- actions can be used to trigger services
- a matched rule can take action on a target
- Automatically react to changes in the AWS resources.
- use the alarm to
- automatically send a notification to an Amazon Simple Notification Service (Amazon SNS) topic
- trigger an EC2 Auto Scaling scale-in or scale-out, or perform an EC2 action, based on metrics
- create an alarm to monitor any Amazon CloudWatch metric in your account
- terminate, reboot, or recover an EC2 instance
- create alarms on
- the CPU utilization of an EC2 instance,
- Elastic Load Balancing request latency,
- Amazon DynamoDB table throughput,
- Amazon Simple Queue Service (Amazon SQS) queue length,
- the charges on your AWS bill.
- custom metrics that are specific to your custom applications or infrastructure.
- no upfront commitment or minimum fee; pay for what you use.
CloudWatch Logs
- CloudWatch log group:
- a container for log streams
- export and send log streams to other AWS services.
- controls the retention period, metric filter, monitoring, and access control
- log stream
- a sequence of log events with the same source
- log event
- a timestamp and a raw message
- metric filter
- a metric filter pattern matches text in all log events in all log streams of the log group it is created on, and turns the matches into a metric
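The log group, log stream, log event, and metric filter concepts above can be modeled with a small sketch. This is an illustration only: `apply_metric_filter` is a hypothetical helper that uses plain case-sensitive substring matching, which is simpler than the real filter pattern syntax, and the stream names and messages are made up.

```python
def apply_metric_filter(log_group, terms):
    """Count log events, across every stream in the group, whose message
    contains all of the given terms (case-sensitive substring match)."""
    count = 0
    for stream_events in log_group.values():
        for event in stream_events:
            if all(term in event["message"] for term in terms):
                count += 1
    return count

# A log group: stream name -> log events, each with a timestamp and raw message.
log_group = {
    "web-1": [
        {"timestamp": 1700000000000, "message": "GET /index.html 200"},
        {"timestamp": 1700000001000, "message": "ERROR database timeout"},
    ],
    "web-2": [
        {"timestamp": 1700000002000, "message": "ERROR disk full"},
        {"timestamp": 1700000003000, "message": "GET /health 200"},
    ],
}

error_count = apply_metric_filter(log_group, ["ERROR"])
print(error_count)
```

The resulting count plays the role of the metric value, which an alarm could then watch like any other metric.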
- CloudWatch Logs lets you monitor and troubleshoot systems and applications using existing system, application, and custom log files.
- logs can then be:
- streamed in real time to data-processing solutions, such as Amazon Kinesis Streams or AWS Lambda,
- sent to an Amazon S3 bucket for durability,
- accessed directly by administrators from the AWS Management Console.
With the CloudWatch Logs agent, you can quickly send both rotated and non-rotated log data off a host and into the log service. You can then access the raw log data when you need it.
- real time application and system monitoring and long term log retention.
- To store and monitor log data in highly durable storage.
- keeps logs indefinitely by default.
- you can change the log retention setting so that old log events are automatically deleted.
- CloudTrail logs can be sent to CloudWatch Logs for real-time monitoring.
- CloudWatch Logs metric filters
- can evaluate CloudTrail logs for specific terms, phrases or values.
- CloudWatch retains metric data as follows:
- Data points with a period of less than 60 seconds are available for 3 hours. These data points are high-resolution custom metrics.
- Data points with a period of 60 seconds (1 minute) are available for 15 days.
- Data points with a period of 300 seconds (5 minutes) are available for 63 days.
- Data points with a period of 3600 seconds (1 hour) are available for 455 days (15 months). 
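The retention schedule above can be expressed as a small lookup. This is a sketch: the function name `retention_for_period` is illustrative, and treating a period that falls between two listed values like the next lower tier is an assumption made here, not something the list states.

```python
from datetime import timedelta

# (exclusive upper bound on the period in seconds, retention for that tier)
RETENTION_TIERS = [
    (60, timedelta(hours=3)),     # high-resolution custom metrics
    (300, timedelta(days=15)),    # 1-minute data points
    (3600, timedelta(days=63)),   # 5-minute data points
]
LONGEST_RETENTION = timedelta(days=455)  # 1-hour data points (15 months)

def retention_for_period(period_seconds):
    """Return how long data points with the given period stay available."""
    if period_seconds <= 0:
        raise ValueError("period must be a positive number of seconds")
    for upper_bound, retention in RETENTION_TIERS:
        if period_seconds < upper_bound:
            return retention
    return LONGEST_RETENTION

for period in (1, 60, 300, 3600):
    print(period, retention_for_period(period))
```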
example
Review a cloud design pattern. Monitoring is necessary for system operation, and the AWS Cloud provides a monitoring service.
- However, the monitoring service in the AWS Cloud cannot monitor the internal workings of a virtual server (the operating system, middleware, applications, and so on),
- so you need an independent monitoring system.
- For example,
- the virtual server is monitored by the AWS Cloud monitoring service,
- use your own system to monitor the operating system, middleware, applications, etc.
- The cloud monitoring service provides an API
- enables your monitoring system to perform centralized control, including of the cloud side,
- by using this API to obtain information from the cloud monitoring system.
- To implement the monitoring service, install monitoring software on the Amazon EC2 instance so that you can obtain monitoring information from the CloudWatch monitoring service.
- Install monitoring software, such as Nagios, Zabbix, Munin…
- Use a plug-in to obtain monitoring information by using the CloudWatch API, and to write that information to the monitoring software.
- And use the plug-in to perform monitoring, including the information from AWS.
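As a rough idea of what such a plug-in might request, the sketch below builds the parameters for a GetMetricStatistics call. It makes no network call and uses only the standard library; the function name `build_get_metric_statistics_params` and the instance ID are hypothetical, and the choice of EC2 CPU utilization is just an example.

```python
from datetime import datetime, timedelta, timezone

def build_get_metric_statistics_params(instance_id, minutes=60, period=300, now=None):
    """Build request parameters for average CPU utilization of one instance."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period,          # seconds per data point
        "Statistics": ["Average"],
    }

# Placeholder instance ID, for illustration only.
params = build_get_metric_statistics_params("i-0123456789abcdef0")
print(params["Namespace"], params["MetricName"], params["Period"])

# Assumption: with boto3 installed and credentials configured, a plug-in could
# pass these as boto3.client("cloudwatch").get_metric_statistics(**params)
# and write the returned data points to the monitoring software.
```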
CloudWatch vs CloudTrail:
This post is licensed under CC BY 4.0 by the author.