
AWS - Security - Amazon CloudWatch (AWS cloud resources)



Amazon CloudWatch 


  1. a monitoring service for AWS cloud resources and the applications run on AWS.
    • a repository for metric data
    • a service that stores metrics
    • accessed via API, CLI, AWS SDKs, and the AWS Management Console.
    • by default, CloudWatch can’t see inside an EC2 instance (for example, memory usage); collecting that data requires the CloudWatch agent or custom metrics.
  2. CloudWatch integrates with AWS IAM
    • specify which CloudWatch actions each user in your AWS account can perform.
    • for example, create an IAM policy that gives only certain users permission to use the GetMetricStatistics operation.
    • those users can then use the operation to retrieve data about your cloud resources.
  3. Permission:
    • can’t use IAM to control access to CloudWatch data for specific resources.
    • For example,
      • you can’t give a user access to CloudWatch data for only a specific set of instances or a specific load balancer.
    • Permissions that are granted through IAM cover all of the cloud resources you use with CloudWatch.
    • can’t use IAM roles with the legacy Amazon CloudWatch command line tools
    • The unified CloudWatch Agent
      • runs in the cloud and on-premises, and on Linux and Windows instances and servers.
      • It also handles metrics and log files.
      • can deploy it using AWS Systems Manager, Run Command, SSM State Manager, or from the CLI.
      • installing the agent doesn’t by itself grant permission to write to CloudWatch Logs.
      • provides the connection that on-premises servers need to push data to CloudWatch and CloudWatch Logs.
    • execution roles or permission roles:
      • permits a service to write logs to CloudWatch Logs
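As a rough illustration of the IAM points above, the sketch below builds a policy document that allows only the GetMetricStatistics operation. The function name and the `Sid` label are hypothetical, and `"Resource": "*"` reflects the note above that access is not scoped to individual resources; treat it as a starting point, not a reviewed policy.

```python
import json


def build_read_only_metrics_policy() -> dict:
    """Return an IAM policy document that allows only GetMetricStatistics."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowGetMetricStatistics",  # hypothetical label
                "Effect": "Allow",
                "Action": ["cloudwatch:GetMetricStatistics"],
                # "*" because, per the notes above, CloudWatch data access
                # is not restricted to specific instances or load balancers.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON that would be attached to a user, group, or role.
    print(json.dumps(build_read_only_metrics_policy(), indent=2))
```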
  4. Monitors, Collects and tracks
    • a distributed metrics-gathering system and a monitoring and observability service
      • built for DevOps engineers, developers, site reliability engineers (SRE), and IT managers.
    • statistics are recorded for a period of 15 months
    • monitors AWS resources and the applications run on AWS in real time
      • collects and processes raw data into readable, near-real-time metrics
      • collects and tracks metrics, which are variables you can measure for
        • your AWS resources and applications.
    • Create and use custom metrics
      • based on data generated by applications and services
      • along with any log files that applications generate.
      • Create custom dashboards for easy viewing of metrics
    • Example:
      • Lambda functions, Kinesis streams, Amazon ECS tasks, Step Functions state machines,
      • SNS topics, SQS queues, and built-in targets…
      • CloudTrail, Route53, VPC flow logs
      • EC2
        • By default, EC2 provides basic monitoring, which sends metric data to CloudWatch in 5-minute periods.
        • To send metric data for your instance to CloudWatch in 1-minute periods, enable detailed monitoring on the instance.
        • EC2 console displays a series of graphs based on the raw data from Amazon CloudWatch.
        • Depending on your needs, you might prefer to get data for your instances from Amazon CloudWatch instead of through the graphs in the console.
        • By default, Amazon CloudWatch does not provide RAM metrics for EC2 instances, though you can configure CloudWatch to collect that data.
      • RDS:
        • number of simultaneous connections: xx for xx min
      • ELB:
        • number of healthy hosts: xx for xx min
      • DynamoDB tables.
      • RDS DB instances.
      • Custom metrics generated by applications and services.
      • Any log files generated by your applications.
  5. gain system-wide visibility about
    • resource use,
    • application performance,
    • operational health.
    • historical information
    • a better perspective on how your web application or service is performing.
  6. produce metrics, and these are time-ordered sets of data.
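Because a metric is a time-ordered set of data points, it can be rolled up into per-period statistics. The sketch below is only an illustration of that idea in plain Python; the function name `aggregate` and the `(timestamp_seconds, value)` input shape are assumptions, not CloudWatch's internal implementation.

```python
from collections import defaultdict
from statistics import mean


def aggregate(datapoints, period_seconds):
    """Group (timestamp_seconds, value) pairs into fixed-length periods
    and compute the common statistics for each period."""
    buckets = defaultdict(list)
    for ts, value in datapoints:
        # Align each data point to the start of the period it falls in.
        period_start = ts - (ts % period_seconds)
        buckets[period_start].append(value)

    return {
        start: {
            "Average": mean(values),
            "Sum": sum(values),
            "Minimum": min(values),
            "Maximum": max(values),
            "SampleCount": len(values),
        }
        for start, values in sorted(buckets.items())
    }


if __name__ == "__main__":
    # Four samples rolled up into 5-minute (300-second) periods.
    samples = [(0, 10.0), (60, 20.0), (299, 30.0), (300, 40.0)]
    for start, stats in aggregate(samples, 300).items():
        print(start, stats)
```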

  7. Turns metrics into statistics to be used by CloudWatch alarms

    • Metrics can be configured with alarms that can take action.
      • collects metrics,
      • turns the metrics into statistics that can be used by CloudWatch alarms,
      • and displays them all in one place.
    • CloudWatch alarms are based on statistics.
      • Statistics are metric data that is aggregated over specified periods of time.
      • Aggregations are made using the namespace, metric name, dimensions, and the data point unit of measure, within the time period that you specify.
        • Namespace
          • A namespace contains the CloudWatch metric that you want, for example, AWS/EC2.
        • Metric:
          • the variable you want to measure, for example, CPU Utilization.
        • Statistic:
          • can be an average, sum, minimum, maximum, sample count, a predefined percentile, or a custom percentile.
        • Period:
          • the evaluation period for the alarm.
          • When the alarm is evaluated, each period is aggregated into one data point.
        • Conditions:
          • for a static threshold, specify whether the alarm triggers when the metric is Greater, Greater or Equal, Lower or Equal, or Lower than the threshold, and specify the threshold value.
        • Additional configuration information:
          • includes the number of data points within the evaluation period that must be breached to trigger the alarm, and how CloudWatch should treat missing data when it evaluates the alarm.
        • Actions:
          • choose to send a notification to an Amazon SNS topic,
          • or to perform an Amazon EC2 Auto Scaling action or Amazon EC2 action.
  8. Events
    • use CloudWatch Events to define rules
      • match incoming events/changes in AWS environment and route them to targets for processing.
      • operational changes as they occur
      • if a rule is matched, can take corrective action as necessary
    • Targets:
      • EC2 instances, Lambda functions, Kinesis streams, Amazon ECS tasks, Step Functions state machines, SNS topics, SQS queues, and built-in targets…
      • CloudWatch Events already has access to AWS API events;
        • only need CloudTrail enabled when services aren’t directly supported.
    • CloudWatch Events becomes aware of operational changes as they occur.
    • CloudWatch Events responds to these operational changes and takes corrective action as necessary, by
      • sending messages to respond to the environment,
      • activating functions,
      • making changes,
      • and capturing state information.
    • 2 options when creating a rule.
      • invoke a target by its event patterns
      • invoke a target by a schedule
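To make rule matching by event pattern more concrete, here is a deliberately simplified matcher. The pattern and event follow the general shape of an EC2 instance state-change event, but the `matches` function is a hypothetical stand-in that handles only exact-value lists and nested objects; it does not cover the full pattern syntax the service supports.

```python
def matches(pattern, event):
    """Return True if every field in the pattern is satisfied by the event.

    Simplified rules (an assumption for illustration):
    - a list in the pattern means "the event value must be one of these"
    - a dict in the pattern is matched recursively against a nested dict
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Rule: react when any EC2 instance enters the "stopped" state.
PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped"]},
}

if __name__ == "__main__":
    event = {
        "source": "aws.ec2",
        "detail-type": "EC2 Instance State-change Notification",
        "detail": {"instance-id": "i-0123456789abcdef0", "state": "stopped"},
    }
    if matches(PATTERN, event):
        print("rule matched: route event to target")
```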
  9. Alarms and action
    • key components of a CloudWatch alarm
      • 3 states:
        • insufficient (INSUFFICIENT_DATA):
          • not enough data to judge the state
          • alarms often start in this state
        • alarm
          • the alarm threshold has been breached
          • such as: >90% CPU
        • ok
          • the threshold has not been breached
      • thresholds
        • create alarms based on:
          • static thresholds
          • anomaly detection.
          • metric math expression
      • metrics
        • measured data points over time.
      • action
        • action can produce an email or even work in conjunction with Auto Scaling groups.
      • period:
        • the period is related to the threshold.
        • the length of time for which a threshold must be surpassed before an alarm is generated.
    • can be configured with alarms to take actions
    • actions can be used to trigger services
      • a matched rule can take action on a target
      • Automatically react to changes in the AWS resources.
      • use the alarm to
        • automatically send a notification to an Amazon Simple Notification Service (Amazon SNS) topic
        • trigger EC2 Auto Scaling to scale in or out, or perform an EC2 action, based on metrics
        • create alarm to monitor any Amazon CloudWatch metric in account
        • terminate, reboot, or recover an EC2 instance
    • create alarms on
      • the CPU utilization of an EC2 instance,
      • Elastic Load Balancing request latency,
      • Amazon DynamoDB table throughput,
      • Amazon Simple Queue Service (Amazon SQS) queue length,
      • the charges on your AWS bill.
      • custom metrics that are specific to your custom applications or infrastructure.
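The three alarm states and the "number of breaching data points" setting can be sketched as a small evaluation function. This is a simplified model under stated assumptions: the function name and argument layout are hypothetical, missing data points are represented as `None` and simply ignored, and the real service offers several configurable ways to treat missing data that are not modelled here.

```python
import operator

# Comparison names mirror the operators used when defining an alarm.
COMPARATORS = {
    "GreaterThanThreshold": operator.gt,
    "GreaterThanOrEqualToThreshold": operator.ge,
    "LessThanThreshold": operator.lt,
    "LessThanOrEqualToThreshold": operator.le,
}


def evaluate_alarm(datapoints, threshold, comparison,
                   evaluation_periods, datapoints_to_alarm):
    """Return "OK", "ALARM", or "INSUFFICIENT_DATA" for the latest window.

    datapoints: one aggregated value per period, oldest first;
                None marks a period with no data.
    """
    # Take the most recent periods (an empty list yields an empty window).
    window = datapoints[-evaluation_periods:]
    present = [v for v in window if v is not None]
    if not present:
        return "INSUFFICIENT_DATA"

    breaches = COMPARATORS[comparison]
    breaching = sum(1 for v in present if breaches(v, threshold))
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


if __name__ == "__main__":
    # CPU above 90% for 3 consecutive periods.
    cpu = [50, 95, 97, 92]
    print(evaluate_alarm(cpu, 90, "GreaterThanThreshold", 3, 3))
```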

  10. no upfront commitment or minimum fee; pay for what you use.

CloudWatch Logs

  1. CloudWatch log group:
    • a container for log streams
    • can export or stream its logs to other AWS services.
    • controls the retention period, metric filter, monitoring, and access control
  2. log stream
    • a sequence of log events with the same source
  3. log event
    • a timestamp and a raw message
  4. metric filter
    • metric filter pattern matches text in all log events in all log streams of whichever log group it’s created on, creating a metric
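The relationship between log groups, log streams, log events, and a metric filter can be sketched with plain data structures. The example assumes the simplest possible filter, a substring match on the raw message; the `count_matches` helper and the sample stream names are hypothetical, and real filter patterns support richer syntax than this.

```python
def count_matches(log_group, term):
    """Count log events, across all streams in a log group, whose raw
    message contains the given term. The count plays the role of the
    metric value produced by a metric filter."""
    return sum(
        1
        for events in log_group.values()   # each log stream
        for event in events                # each log event in the stream
        if term in event["message"]
    )


if __name__ == "__main__":
    # A log group maps stream names to time-ordered log events.
    log_group = {
        "web-1": [
            {"timestamp": 1589000000, "message": "GET /index.html 200"},
            {"timestamp": 1589000005, "message": "ERROR upstream timeout"},
        ],
        "web-2": [
            {"timestamp": 1589000002, "message": "ERROR disk full"},
        ],
    }
    print(count_matches(log_group, "ERROR"))
```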


  1. CloudWatch Logs
    • monitor and troubleshoot your systems and applications using existing system, application, and custom log files.

    • take those logs and send them:
      • streamed in real time to data-processing solutions, such as Amazon Kinesis Streams or AWS Lambda
      • to an Amazon S3 bucket for durability,
      • or have administrators access them directly from the AWS Management Console.
    • With the CloudWatch Logs agent, you can quickly send both rotated and non-rotated log data off a host and into the log service. You can then access the raw log data when you need it.

    • real time application and system monitoring and long term log retention.
      • To store and monitor log data in highly durable storage.
      • keeps logs indefinitely by default.
      • can change the log retention setting so that log events older than the setting are automatically deleted.
    • CloudTrail logs can be sent to CloudWatch Logs for real-time monitoring.
    • CloudWatch Logs metric filters
      • can evaluate CloudTrail logs for specific terms, phrases or values.
  2. CloudWatch retains metric data as follows:
    • Data points with a period of less than 60 seconds are available for 3 hours. These data points are high-resolution custom metrics.
    • Data points with a period of 60 seconds (1 minute) are available for 15 days.
    • Data points with a period of 300 seconds (5 minutes) are available for 63 days.
    • Data points with a period of 3600 seconds (1 hour) are available for 455 days (15 months). 
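The retention tiers listed above can be written as a small lookup. The function name is hypothetical, and treating a period that falls between two listed values as belonging to the tier of the next-lower listed period is an assumption made only for this sketch.

```python
from datetime import timedelta


def metric_retention(period_seconds):
    """Return how long data points with the given period are retained,
    following the tiers listed above."""
    if period_seconds < 60:
        return timedelta(hours=3)    # high-resolution custom metrics
    if period_seconds < 300:
        return timedelta(days=15)    # 1-minute data points
    if period_seconds < 3600:
        return timedelta(days=63)    # 5-minute data points
    return timedelta(days=455)       # 1-hour data points (15 months)


if __name__ == "__main__":
    for period in (1, 60, 300, 3600):
        print(period, metric_retention(period))
```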

example


Review a cloud design pattern. Monitoring is necessary for system operation, and the AWS Cloud provides a monitoring service.

  • However, the monitoring service in the AWS Cloud cannot monitor the internal workings of a virtual server, such as the operating system, middleware, and applications,
  • so you need an independent monitoring system.
  • For example,
  • the virtual server is monitored by the AWS Cloud monitoring service,
  • use your own system to monitor the operating system, middleware, applications, etc.
  • The cloud monitoring service provides an API
    • enables you to use your monitoring system to perform centralized control, including of the cloud side
    • through this API, to obtain information from the cloud monitoring system.
  • To implement the monitoring service, install monitoring software on the Amazon EC2 instance so that you can obtain monitoring information from the CloudWatch monitoring service.
    • Install monitoring software, such as Nagios, Zabbix, Munin…
    • Use a plug-in to obtain monitoring information by using the CloudWatch API, and to write that information to the monitoring software.
    • And use the plug-in to perform monitoring, including the information from AWS.
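A plug-in of the kind described above might look roughly like the following sketch. It builds a GetMetricStatistics request for an instance's CPU utilization and maps the latest data point to the exit-code convention used by Nagios-style checks (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function names, thresholds, and the injected `fetch` callable are assumptions for illustration; the source does not specify how its plug-in is written.

```python
from datetime import datetime, timedelta, timezone


def build_request(instance_id, now, minutes=10):
    """Build keyword arguments for a GetMetricStatistics call."""
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": now - timedelta(minutes=minutes),
        "EndTime": now,
        "Period": 300,
        "Statistics": ["Average"],
    }


def check_cpu(fetch, instance_id, warn=80.0, crit=90.0, now=None):
    """Return (exit_code, message) in the style of a monitoring plug-in.

    fetch: any callable that accepts the request as keyword arguments and
           returns a dict with a "Datapoints" list (hypothetical injection
           point so the check can run without AWS access).
    """
    now = now or datetime.now(timezone.utc)
    response = fetch(**build_request(instance_id, now))
    datapoints = response.get("Datapoints", [])
    if not datapoints:
        return 3, "UNKNOWN - no datapoints returned"

    # Use the most recent data point in the response.
    latest = max(datapoints, key=lambda d: d["Timestamp"])
    value = latest["Average"]
    if value >= crit:
        return 2, f"CRITICAL - CPU {value:.1f}%"
    if value >= warn:
        return 1, f"WARNING - CPU {value:.1f}%"
    return 0, f"OK - CPU {value:.1f}%"
```

If boto3 is installed and credentials are configured, `boto3.client("cloudwatch").get_metric_statistics` could be passed as `fetch`; in tests, a stub function returning a canned response works the same way.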

CloudWatch vs CloudTrail:

[Image: CloudWatch vs CloudTrail comparison]

This post is licensed under CC BY 4.0 by the author.
