# SOME DESCRIPTIVE TITLE.
# Copyright (C) 2024, OpenStack Foundation
# This file is distributed under the same license as the Swift package.
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: Swift 2.35.0.dev62\n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-11-18 20:34+0000\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

#: ../../source/ops_runbook/diagnose.rst:3
msgid "Identifying issues and resolutions"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:6
msgid "Is the system up?"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:8
msgid ""
"If you have a report that Swift is down, perform the following basic checks:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:10
msgid "Run swift functional tests."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:12
msgid ""
"From a server in your data center, use ``curl`` to check ``/healthcheck`` "
"(see below)."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:15
msgid "If you have a monitoring system, check your monitoring system."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:17
msgid "Check your hardware load balancers infrastructure."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:19
msgid "Run swift-recon on a proxy node."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:22
msgid "Functional tests usage"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:24
msgid ""
"We would recommend that you set up the functional tests to run against your "
"production system. Run regularly this can be a useful tool to validate that "
"the system is configured correctly. In addition, it can provide early "
"warning about failures in your system (if the functional tests stop working, "
"user applications will also probably stop working)."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:30
msgid ""
"A script for running the function tests is located in ``swift/.functests``."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:34
msgid "External monitoring"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:36
msgid ""
"We use pingdom.com to monitor the external Swift API. We suggest the "
"following:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:39
msgid "Do a GET on ``/healthcheck``"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:41
msgid ""
"Create a container, make it public (``x-container-read: .r*,.rlistings``), "
"create a small file in the container; do a GET on the object"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:46
msgid "Diagnose: General approach"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:48
msgid "Look at service status in your monitoring system."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:50
msgid ""
"In addition to system monitoring tools and issue logging by users, swift "
"errors will often result in log entries (see :ref:`swift_logs`)."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:53
msgid "Look at any logs your deployment tool produces."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:55
msgid ""
"Log files should be reviewed for error signatures (see below) that may point "
"to a known issue, or root cause issues reported by the diagnostics tools, "
"prior to escalation."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:60
msgid "Dependencies"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:62
msgid ""
"The Swift software is dependent on overall system health. Operating system "
"level issues with network connectivity, domain name resolution, user "
"management, hardware and system configuration and capacity in terms of "
"memory and free disk space, may result is secondary Swift issues. System "
"level issues should be resolved prior to diagnosis of swift issues."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:71
msgid "Diagnose: Swift-dispersion-report"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:73
msgid ""
"The swift-dispersion-report is a useful tool to gauge the general health of "
"the system. Configure the ``swift-dispersion`` report to cover at a minimum "
"every disk drive in your system (usually 1% coverage). See :ref:"
"`dispersion_report` for details of how to configure and use the dispersion "
"reporting tool."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:79
msgid ""
"The ``swift-dispersion-report`` tool can take a long time to run, especially "
"if any servers are down. We suggest you run it regularly (e.g., in a cron "
"job) and save the results. This makes it easy to refer to the last report "
"without having to wait for a long-running command to complete."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:86
msgid "Diagnose: Is system responding to ``/healthcheck``?"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:88
msgid ""
"When you want to establish if a swift endpoint is running, run ``curl -k`` "
"against ``https://$ENDPOINT/healthcheck``."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:94
msgid "Diagnose: Interpreting messages in ``/var/log/swift/`` files"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:98
msgid ""
"In the Hewlett Packard Enterprise Helion Public Cloud we send logs to "
"``proxy.log`` (proxy-server logs), ``server.log`` (object-server, account-"
"server, container-server logs), ``background.log`` (all other servers "
"[object-replicator, etc])."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:103
msgid "The following table lists known issues:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:109
msgid "**Logfile**"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:110
msgid "**Signature**"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:111
msgid "**Issue**"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:112
msgid "**Steps to take**"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:113
#: ../../source/ops_runbook/diagnose.rst:118
#: ../../source/ops_runbook/diagnose.rst:123
#: ../../source/ops_runbook/diagnose.rst:127
msgid "/var/log/syslog"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:114
msgid "kernel: [] sd .... [csbu:sd...] Sense Key: Medium Error"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:115
msgid "Suggests disk surface issues"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:116
msgid ""
"Run ``swift-drive-audit`` on the target node to check for disk errors, "
"repair disk errors"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:119
msgid "kernel: [] sd .... [csbu:sd...] Sense Key: Hardware Error"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:120
msgid "Suggests storage hardware issues"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:121
msgid ""
"Run diagnostics on the target node to check for disk failures, replace "
"failed disks"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:124
msgid "kernel: [] .... I/O error, dev sd.... ,sector ...."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:126
msgid "Run diagnostics on the target node to check for disk errors"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:128
msgid "pound: NULL get_thr_arg"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:129
msgid "Multiple threads woke up"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:130
msgid "Noise, safe to ignore"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:131
#: ../../source/ops_runbook/diagnose.rst:137
msgid "/var/log/swift/proxy.log"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:132
#: ../../source/ops_runbook/diagnose.rst:143
msgid ".... ERROR .... ConnectionTimeout ...."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:133
msgid "A storage node is not responding in a timely fashion"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:134
msgid ""
"Check if node is down, not running Swift, unconfigured, storage off-line or "
"for network issues between the proxy and non responding node"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:138
msgid "proxy-server .... HTTP/1.0 500 ...."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:139
msgid "A proxy server has reported an internal server error"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:140
msgid ""
"Examine the logs for any errors at the time the error was reported to "
"attempt to understand the cause of the error."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:142
#: ../../source/ops_runbook/diagnose.rst:148
msgid "/var/log/swift/server.log"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:144
#: ../../source/ops_runbook/diagnose.rst:172
msgid "A storage server is not responding in a timely fashion"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:145
#: ../../source/ops_runbook/diagnose.rst:157
msgid ""
"Check if node is down, not running Swift, unconfigured, storage off-line or "
"for network issues between the server and non responding node"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:149
msgid ".... ERROR .... Remote I/O error: '/srv/node/disk...."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:150
msgid "A storage device is not responding as expected"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:151
msgid ""
"Run ``swift-drive-audit`` and check the filesystem named in the error for "
"corruption (unmount & xfs_repair). Check if the filesystem is mounted and "
"working."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:154
#: ../../source/ops_runbook/diagnose.rst:160
#: ../../source/ops_runbook/diagnose.rst:166
#: ../../source/ops_runbook/diagnose.rst:170
#: ../../source/ops_runbook/diagnose.rst:175
#: ../../source/ops_runbook/diagnose.rst:180
#: ../../source/ops_runbook/diagnose.rst:184
msgid "/var/log/swift/background.log"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:155
msgid "object-server ERROR container update failed .... Connection refused"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:156
msgid "A container server node could not be contacted"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:161
msgid "object-updater ERROR with remote .... ConnectionTimeout"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:162
msgid "The remote container server is busy"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:163
msgid ""
"If the container is very large, some errors updating it can be expected. "
"However, this error can also occur if there is a networking issue."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:167
msgid "account-reaper STDOUT: .... error: ECONNREFUSED"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:168
msgid "Network connectivity issue or the target server is down."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:169
msgid "Resolve network issue or reboot the target server"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:171
msgid ".... ERROR .... ConnectionTimeout"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:173
#: ../../source/ops_runbook/diagnose.rst:178
msgid ""
"The target server may be busy. However, this error can also occur if there "
"is a networking issue."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:176
msgid ".... ERROR syncing .... Timeout"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:177
msgid "A timeout occurred syncing data to another node."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:181
msgid ".... ERROR Remote drive not mounted ...."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:182
#: ../../source/ops_runbook/diagnose.rst:186
msgid "A storage server disk is unavailable"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:183
#: ../../source/ops_runbook/diagnose.rst:187
msgid "Repair and remount the file system (on the remote node)"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:185
msgid "object-replicator .... responded as unmounted"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:188
msgid "/var/log/swift/\\*.log"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:189
msgid "STDOUT: EXCEPTION IN"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:190
msgid "A unexpected error occurred"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:191
msgid ""
"Read the Traceback details, if it matches known issues (e.g. active network/"
"disk issues), check for re-ocurrences after the primary issues have been "
"resolved"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:194
msgid "/var/log/rsyncd.log"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:195
msgid "rsync: mkdir \"/disk....failed: No such file or directory...."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:196
msgid "A local storage server disk is unavailable"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:197
msgid "Run diagnostics on the node to check for a failed or unmounted disk"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:199
msgid "/var/log/swift*"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:200
msgid "Exception: Could not bind to 0.0.0.0:6xxx"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:201
msgid ""
"Possible Swift process restart issue. This indicates an old swift process is "
"still running."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:203
msgid ""
"Restart Swift services. If some swift services are reported down, check if "
"they left residual process behind."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:207
msgid "Diagnose: Parted reports the backup GPT table is corrupt"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:209
msgid ""
"If a GPT table is broken, a message like the following should be observed "
"when the following command is run:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:223
msgid "To fix, go to :ref:`fix_broken_gpt_table`"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:227
msgid "Diagnose: Drives diagnostic reports a FS label is not acceptable"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:229
msgid ""
"If diagnostics reports something like  \"FS label: obj001dsk011 is not "
"acceptable\", it indicates that a partition has a valid disk label, but an "
"invalid filesystem label. In such cases proceed as follows:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:233
msgid "Verify that the disk labels are correct:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:241
msgid ""
"If partition labels are inconsistent then, resolve the disk label issues "
"before proceeding:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:250
msgid "If the Filesystem label is missing then create it with care:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:271
msgid "Diagnose: Failed LUNs"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:275
msgid ""
"The HPE Helion Public Cloud uses direct attach SmartArray controllers/"
"drives. The information here is specific to that environment. The hpacucli "
"utility mentioned here may be called hpssacli in your environment."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:280
msgid ""
"The ``swift_diagnostics`` mount checks may return a warning that a LUN has "
"failed, typically accompanied by DriveAudit check failures and device errors."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:284
msgid ""
"Such cases are typically caused by a drive failure, and if drive check also "
"reports a failed status for the underlying drive, then follow the procedure "
"to replace the disk."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:288
msgid "Otherwise the lun can be re-enabled as follows:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:290
msgid ""
"Generate a hpssacli diagnostic report. This report allows the DC team to "
"troubleshoot potential cabling or hardware issues so it is imperative that "
"you run it immediately when troubleshooting a failed LUN. You will come back "
"later and grep this file for more details, but just generate it for now."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:300
msgid ""
"Export the following variables using the below instructions before "
"proceeding further."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:303
msgid ""
"Print a list of logical drives and their numbers and take note of the failed "
"drive's number and array value (example output: \"array A logicaldrive 1..."
"\" would be exported as LDRIVE=1):"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:311
msgid ""
"Export the number of the logical drive that was retrieved from the previous "
"command into the LDRIVE variable:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:318
msgid ""
"Print the array value and Port:Box:Bay for all drives and take note of the "
"Port:Box:Bay for the failed drive (example output: \" array A physicaldrive "
"2C:1:1...\" would be exported as PBOX=2C:1:1). Match the array value of this "
"output with the array value obtained from the previous command to be sure "
"you are working on the same drive. Also, the array value usually matches the "
"device name (For example, /dev/sdc in the case of \"array c\"), but we will "
"run a different command to be sure we are operating on the correct device."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:333
msgid ""
"Sometimes a LUN may appear to be failed as it is not and cannot be mounted "
"but the hpssacli/parted commands may show no problems with the LUNS/drives. "
"In this case, the filesystem may be corrupt and may be necessary to run "
"``sudo xfs_check /dev/sd[a-l][1-2]`` to see if there is an xfs issue. The "
"results of running this command may require that ``xfs_repair`` is run."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:340
msgid "Export the Port:Box:Bay for the failed drive into the PBOX variable:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:346
msgid ""
"Print the physical device information and take note of the Disk Name "
"(example output: \"Disk Name: /dev/sdk\" would be exported as DEV=/dev/sdk):"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:354
msgid ""
"Export the device name variable from the preceding command (example: /dev/"
"sdk):"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:361
msgid ""
"Export the filesystem variable. Disks that are split between the operating "
"system and data storage, typically sda and sdb, should  only have repairs "
"done on their data filesystem, usually /dev/sda2 and /dev/sdb2, Other data "
"only disks have just one partition on the device, so the filesystem will be "
"1. In any case you should verify the data filesystem by running ``df -h | "
"grep /srv/node`` and using the listed data filesystem for the device in "
"question as the export. For example: /dev/sdk1."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:374
msgid "Verify the LUN is failed, and the device is not:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:383
#: ../../source/ops_runbook/diagnose.rst:674
msgid "Stop the swift and rsync service:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:390
msgid "Unmount the problem drive, fix the LUN and the filesystem:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:396
msgid ""
"If umount fails, you should run lsof search for the mountpoint and kill any "
"lingering processes before repeating the unpount:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:404
msgid ""
"If the ``xfs_repair`` complains about possible journal data, use the "
"``xfs_repair -L`` option to zeroise the journal log."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:407
msgid ""
"Once complete test-mount the filesystem, and tidy up its lost and found area."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:416
msgid "Mount the filesystem and restart swift and rsync."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:418
msgid ""
"Run the following to determine if a DC ticket is needed to check the cables "
"on the node:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:426
msgid ""
"If the output reports any non 0x00 values, it suggests that the cables "
"should be checked. For example, log a DC ticket to check the sas cables "
"between the drive and the expander."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:433
msgid "Diagnose: Slow disk devices"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:437
msgid "collectl is an open-source performance gathering/analysis tool."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:439
msgid ""
"If the diagnostics report a message such as ``sda: drive is slow``, you "
"should log onto the node and run the following command (remove ``-c 1`` "
"option to continuously monitor the data):"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:470
msgid ""
"Look at the ``Wait`` and ``SvcTime`` values. It is not normal for these "
"values to exceed 50msec. This is known to impact customer performance "
"(upload/download). For a controller problem, many/all drives will show long "
"wait and service times. A reboot may correct the problem; otherwise hardware "
"replacement is needed."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:476
msgid "Another way to look at the data is as follows:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:507
msgid ""
"This shows the historical distribution of the wait and service times over a "
"day. This is how you read it:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:510
msgid ""
"sda did 54580 operations with a short wait time, 371 operations with a "
"longer wait time and 65 with an even longer wait time."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:513
msgid ""
"sdl did 50106 operations with a short wait time, but as you can see many "
"took longer."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:516
msgid ""
"There is a clear pattern that sdf to sdl have a problem. Actually, sda to "
"sde would more normally have lots of zeros in their data. But maybe this is "
"a busy system. In this example it is worth changing the controller as the "
"individual drives may be ok."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:521
msgid ""
"After the controller is changed, use collectl -s D as described above to see "
"if the problem has cleared. disk-anal.pl will continue to show historical "
"data. You can look at recent data as follows. It only looks at data from "
"13:15 to 14:15. As you can see, this is a relatively clean system (few if "
"any long wait or service times):"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:555
msgid ""
"For long wait times, where the service time appears normal is to check the "
"logical drive cache status. While the cache may be enabled, it can be "
"disabled on a per-drive basis."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:560
msgid "Diagnose: Slow network link - Measuring network performance"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:562
msgid ""
"Network faults can cause performance between Swift nodes to degrade. Testing "
"with ``netperf`` is recommended. Other methods (such as copying large files) "
"may also work, but can produce inconclusive results."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:566
msgid ""
"Install ``netperf`` on all systems if not already installed. Check that the "
"UFW rules for its control port are in place. However, there are no pre-"
"opened ports for netperf's data connection. Pick a port number. In this "
"example, 12866 is used because it is one higher than netperf's default "
"control port number, 12865. If you get very strange results including zero "
"values, you may not have gotten the data port opened in UFW at the target or "
"may have gotten the netperf command-line wrong."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:575
msgid ""
"Pick a ``source`` and ``target`` node. The source is often a proxy node and "
"the target is often an object node. Using the same source proxy you can test "
"communication to different object nodes in different AZs to identity "
"possible bottlenecks."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:581
msgid "Running tests"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:583
msgid "Prepare the ``target`` node as follows:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:589
msgid "Or, do:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:595
msgid ""
"On the ``source`` node, run the following command to check throughput. Note "
"the double-dash before the -P option. The command takes 10 seconds to "
"complete. The ``target`` node is 192.168.245.5."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:610
msgid "On the ``source`` node, run the following command to check latency:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:625
msgid "Expected results"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:627
msgid ""
"Faults will show up as differences between different pairs of nodes. "
"However, for reference, here are some expected numbers:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:630
msgid ""
"For throughput, proxy to proxy, expect ~9300 Mbit/sec  (proxies have a 10Ge "
"link)."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:633
msgid ""
"For throughout, proxy to object, expect ~920 Mbit/sec  (at time of writing "
"this, object nodes have a 1Ge link)."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:636
msgid "For throughput, object to object, expect ~920 Mbit/sec."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:638
msgid "For latency (all types), expect ~11000 transactions/sec."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:641
msgid "Diagnose: Remapping sectors experiencing UREs"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:643
msgid "Find the bad sector, device, and filesystem in ``kern.log``."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:645
msgid "Set the environment variables SEC, DEV & FS, for example:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:653
msgid "Verify that the sector is bad:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:659
msgid "If the sector is bad this command will output an input/output error:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:667
msgid ""
"Prevent chef from attempting to re-mount the filesystem while the repair is "
"in progress:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:681
msgid "Unmount the problem drive:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:687
msgid "Overwrite/remap the bad sector:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:693
msgid ""
"This command should report an input/output error the first time it is run. "
"Run the command a second time, if it successfully remapped the bad sector it "
"should not report an input/output error."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:697
msgid "Verify the sector is now readable:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:703
msgid ""
"If the sector is now readable this command should not report an input/output "
"error."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:706
msgid ""
"If more than one problem sector is listed, set the SEC environment variable "
"to the next sector in the list:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:713
msgid "Repeat from step 8."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:715
#: ../../source/ops_runbook/diagnose.rst:986
msgid "Repair the filesystem:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:721
msgid ""
"If ``xfs_repair`` reports that the filesystem has valuable filesystem "
"changes:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:739
msgid ""
"You should attempt to mount the filesystem, and clear the lost+found area:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:748
msgid ""
"If the filesystem fails to mount then you will need to use the ``xfs_repair -"
"L`` option to force log zeroing. Repeat step 11."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:752
msgid ""
"If ``xfs_repair`` reports that an additional input/output error has been "
"encountered, get the sector details as follows:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:759
msgid ""
"If new input/output error is reported then set the SEC environment variable "
"to the problem sector number:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:766
msgid "Repeat from step 8"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:769
msgid "Remount the filesystem and restart swift and rsync."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:771
msgid ""
"If all UREs in the kern.log have been fixed and you are still unable to have "
"xfs_repair disk, it is possible that the URE's have corrupted the filesystem "
"or possibly destroyed the drive altogether. In this case, the first step is "
"to re-format the filesystem and if this fails, get the disk replaced."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:779
msgid "Diagnose: High system latency"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:783
msgid ""
"The latency measurements described here are specific to the HPE Helion "
"Public Cloud."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:786
msgid ""
"A bad NIC on a proxy server. However, as explained above, this usually "
"causes the peak to rise, but average should remain near normal parameters. A "
"quick fix is to shutdown the proxy."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:790
msgid ""
"A stuck memcache server. Accepts connections, but then will not respond. "
"Expect to see timeout messages in ``/var/log/proxy.log`` (port 11211). Swift "
"Diags will also report this as a failed node/port. A quick fix is to "
"shutdown the proxy server."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:795
msgid ""
"A bad/broken object server can also cause problems if the accounts used by "
"the monitor program happen to live on the bad object server."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:798
msgid ""
"A general network problem within the data canter. Compare the results with "
"the Pingdom monitors to see if they also have a problem."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:802
msgid "Diagnose: Interface reports errors"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:804
msgid ""
"Should a network interface on a Swift node begin reporting network errors, "
"it may well indicate a cable, switch, or network issue."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:807
msgid "Get an overview of the interface with:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:814
msgid ""
"The ``Link Detected:`` indicator will read ``yes`` if the nic is cabled."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:817
msgid "Establish the adapter type with:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:823
msgid "Gather the interface statistics with:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:829
msgid "If the nick supports self test, this can be performed with:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:835
msgid "Self tests should read ``PASS`` if the nic is operating correctly."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:837
msgid ""
"Nic module drivers can be re-initialised by carefully removing and re-"
"installing the modules (this avoids rebooting the server). For example, "
"mellanox drivers use a two part driver mlx4_en and mlx4_core. To reload "
"these you must carefully remove the mlx4_en (ethernet) then the mlx4_core "
"modules, and reinstall them in the reverse order."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:844
msgid ""
"As the interface will be disabled while the modules are unloaded, you must "
"be very careful not to lock yourself out so it may be better to script this."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:849
msgid "Diagnose: Hung swift object replicator"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:851
msgid ""
"A replicator reports in its log that remaining time exceeds 100 hours. This "
"may indicate that the swift ``object-replicator`` is stuck and not making "
"progress. Another useful way to check this is with the 'swift-recon -r' "
"command on a swift proxy server:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:869
msgid ""
"The ``Oldest completion`` line in this example indicates that the object-"
"replicator on swift object server 192.168.245.3 has not completed the "
"replication cycle in 12 days. This replicator is stuck. The object "
"replicator cycle is generally less than 1 hour. Though an replicator cycle "
"of 15-20 hours can occur if nodes are added to the system and a new ring has "
"been deployed."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:876
msgid ""
"You can further check if the object replicator is stuck by logging on the "
"object server and checking the object replicator progress with the following "
"command:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:900
msgid ""
"The above status is output every 5 minutes to ``/var/log/swift/background."
"log``."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:904
msgid ""
"The 'remaining' time is increasing as time goes on, normally the time "
"remaining should be decreasing. Also note the partition number. For example, "
"15344 remains the same for several status lines. Eventually the object "
"replicator detects the hang and attempts to make progress by killing the "
"problem thread. The replicator then progresses to the next partition but "
"quite often it again gets stuck on the same partition."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:911
msgid ""
"One of the reasons for the object replicator hanging like this is filesystem "
"corruption on the drive. The following is a typical log entry of a corrupted "
"filesystem detected by the object replicator:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:923
msgid ""
"An ``ls`` of the problem file or directory usually shows something like the "
"following:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:930
msgid ""
"If no entry with ``Remote I/O error`` occurs in the ``background.log`` it is "
"not possible to determine why the object-replicator is hung. It may be that "
"the ``Remote I/O error`` entry is older than 7 days and so has been rotated "
"out of the logs. In this scenario it may be best to simply restart the "
"object-replicator."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:936
msgid "Stop the object-replicator:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:942
msgid ""
"Make sure the object replicator has stopped, if it has hung, the stop "
"command will not stop the hung process:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:949
msgid ""
"If the previous ps shows the object-replicator is still running, kill the "
"process:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:956
msgid "Start the object-replicator:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:962
msgid ""
"If the above grep did find an ``Remote I/O error`` then it may be possible "
"to repair the problem filesystem."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:965
msgid "Stop swift and rsync:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:972
msgid "Make sure all swift process have stopped:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:978
msgid "Kill any swift processes still running."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:980
msgid "Unmount the problem filesystem:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:992
msgid ""
"If the ``xfs_repair`` fails then it may be necessary to re-format the "
"filesystem. See :ref:`fix_broken_xfs_filesystem`. If the ``xfs_repair`` is "
"successful, re-enable chef using the following command and replication "
"should commence again."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:999
msgid "Diagnose: High CPU load"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1001
msgid ""
"The CPU load average on an object server, as shown with the 'uptime' "
"command, is typically under 10 when the server is lightly-moderately loaded:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1010
msgid ""
"During times of increased activity, due to user transactions or object "
"replication, the CPU load average can increase to  to around 30."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1013
msgid ""
"However, sometimes the CPU load average can increase significantly. The "
"following is an example of an object server that has extremely high CPU load:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1023
msgid "Further issues and resolutions"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1027
msgid ""
"The urgency levels in each **Action** column indicates whether or not it is "
"required to take immediate action, or if the problem can be worked on during "
"business hours."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1035
msgid "**Scenario**"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1036
msgid "**Description**"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1037
msgid "**Action**"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1038
msgid "``/healthcheck`` latency is high."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1039
msgid ""
"The ``/healthcheck`` test does not tax the proxy very much so any drop in "
"value is probably related to network issues, rather than the proxies being "
"very busy. A very slow proxy might impact the average number, but it would "
"need to be very slow to shift the number that much."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1042
msgid ""
"Check networks. Do a ``curl https://<ip-address>:<port>/healthcheck`` where "
"``ip-address`` is individual proxy IP address. Repeat this for every proxy "
"server to see if you can pin point the problem."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1046
msgid ""
"Urgency: If there are other indications that your system is slow, you should "
"treat this as an urgent problem."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1048
msgid "Swift process is not running."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1049
msgid ""
"You can use ``swift-init`` status to check if swift processes are running on "
"any given server."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1051
msgid "Run this command:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1057
msgid ""
"Examine messages in the swift log files to see if there are any error "
"messages related to any of the swift processes since the time you ran the "
"``swift-init`` command."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1061
msgid "Take any corrective actions that seem necessary."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1063
#: ../../source/ops_runbook/diagnose.rst:1094
#: ../../source/ops_runbook/diagnose.rst:1107
#: ../../source/ops_runbook/diagnose.rst:1129
msgid ""
"Urgency: If this only affects one server, and you have more than one, "
"identifying and fixing the problem can wait until business hours. If this "
"same problem affects many servers, then you need to take corrective action "
"immediately."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1067
msgid "ntpd is not running."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1068
msgid "NTP is not running."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1069
msgid "Configure and start NTP."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1071
msgid "Urgency: For proxy servers, this is vital."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1073
msgid "Host clock is not syncd to an NTP server."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1074
msgid ""
"Node time settings does not match NTP server time. This may take some time "
"to sync after a reboot."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1076
msgid ""
"Assuming NTP is configured and running, you have to wait until the times "
"sync."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1077
msgid "A swift process has hundreds, to thousands of open file descriptors."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1078
msgid ""
"May happen to any of the swift processes. Known to have happened with a "
"``rsyslod`` restart and where ``/tmp`` was hanging."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1081
msgid "Restart the swift processes on the affected node:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1088
msgid "If known performance problem: Immediate"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1089
msgid "Urgency:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1090
msgid "If system seems fine: Medium"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1091
msgid "A swift process is not owned by the swift user."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1092
msgid ""
"If the UID of the swift user has changed, then the processes might not be "
"owned by that UID."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1098
msgid "Object account or container files not owned by swift."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1099
msgid ""
"This typically happens if during a reinstall or a re-image of a server that "
"the UID of the swift user was changed. The data files in the object account "
"and container directories are owned by the original swift UID. As a result, "
"the current swift user does not own these files."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1103
msgid ""
"Correct the UID of the swift user to reflect that of the original UID. An "
"alternate action is to change the ownership of every file on all file "
"systems. This alternate action is often impractical and will take "
"considerable time."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1111
msgid "A disk drive has a high IO wait or service time."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1112
msgid ""
"If high wait IO times are seen for a single disk, then the disk drive is the "
"problem. If most/all devices are slow, the controller is probably the source "
"of the problem. The controller cache may also be miss configured – which "
"will cause similar long wait or service times."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1116
msgid ""
"As a first step, if your controllers have a cache, check that it is enabled "
"and their battery/capacitor is working."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1119
msgid ""
"Second, reboot the server. If problem persists, file a DC ticket to have the "
"drive or controller replaced. See :ref:`diagnose_slow_disk_drives` on how to "
"check the drive wait or service times."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1123
msgid "Urgency: Medium"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1124
msgid "The network interface is not up."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1125
msgid ""
"Use the ``ifconfig`` and ``ethtool`` commands to determine the network state."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1126
msgid ""
"You can try restarting the interface. However, generally the interface (or "
"cable) is probably broken, especially if the interface is flapping."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1133
msgid "Network interface card (NIC) is not operating at the expected speed."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1134
msgid ""
"The NIC is running at a slower speed than its nominal rated speed. For "
"example, it is running at 100 Mb/s and the NIC is a 1Ge NIC."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1136
msgid "Try resetting the interface with:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1142
msgid "... and then run:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1148
msgid ""
"See if size goes to the expected speed. Failing that, check hardware (NIC "
"cable/switch port)."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1151
msgid ""
"If persistent, consider shutting down the server (especially if a proxy) "
"until the problem is identified and resolved. If you leave this server "
"running it can have a large impact on overall performance."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1155
msgid "Urgency: High"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1156
msgid "The interface RX/TX error count is non-zero."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1157
msgid ""
"A value of 0 is typical, but counts of 1 or 2 do not indicate a problem."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1158
msgid ""
"For low numbers (For example, 1 or 2), you can simply ignore. Numbers in the "
"range 3-30 probably indicate that the error count has crept up slowly over a "
"long time. Consider rebooting the server to remove the report from the noise."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1162
msgid ""
"Typically, when a cable or interface is bad, the error count goes to 400+. "
"For example, it stands out. There may be other symptoms such as the "
"interface going up and down or not running at correct speed. A server with a "
"high error count should be watched."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1166
msgid ""
"If the error count continues to climb, consider taking the server down until "
"it can be properly investigated. In any case, a reboot should be done to "
"clear the error count."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1170
msgid "Urgency: High, if the error count increasing."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1172
msgid ""
"In a swift log you see a message that a process has not replicated in over "
"24 hours."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1173
msgid ""
"The replicator has not successfully completed a run in the last 24 hours. "
"This indicates that the replicator has probably hung."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1175
msgid "Use ``swift-init`` to stop and then restart the replicator process."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1177
msgid ""
"Urgency: Low. However if you recently added or replaced disk drives then you "
"should treat this urgently."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1179
msgid "Container Updater has not run in 4 hour(s)."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1180
msgid ""
"The service may appear to be running however, it may be hung. Examine their "
"swift logs to see if there are any error messages relating to the container "
"updater. This may potentially explain why the container is not running."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1183
msgid ""
"Urgency: Medium This may have been triggered by a recent restart of the  "
"rsyslog daemon. Restart the service with:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1191
msgid ""
"Object replicator: Reports the remaining time and that time is more than 100 "
"hours."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1192
msgid ""
"Each replication cycle the object replicator writes a log message to its log "
"reporting statistics about the current cycle. This includes an estimate for "
"the remaining time needed to replicate all objects. If this time is longer "
"than 100 hours, there is a problem with the replication process."
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1196
msgid "Urgency: Medium Restart the service with:"
msgstr ""

#: ../../source/ops_runbook/diagnose.rst:1203
msgid "Check that the remaining replication time is going down."
msgstr ""

#: ../../source/ops_runbook/index.rst:3
msgid "Swift Ops Runbook"
msgstr ""

#: ../../source/ops_runbook/index.rst:5
msgid ""
"This document contains operational procedures that Hewlett Packard "
"Enterprise (HPE) uses to operate and monitor the Swift system within the HPE "
"Helion Public Cloud. This document is an excerpt of a larger product-"
"specific handbook. As such, the material may appear incomplete. The "
"suggestions and recommendations made in this document are for our particular "
"environment, and may not be suitable for your environment or situation. We "
"make no representations concerning the accuracy, adequacy, completeness or "
"suitability of the information, suggestions or recommendations. This "
"document are provided for reference only. We are not responsible for your "
"use of any information, suggestions or recommendations contained herein."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:3
msgid "Server maintenance"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:6
msgid "General assumptions"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:8
msgid ""
"It is assumed that anyone attempting to replace hardware components will "
"have already read and understood the appropriate maintenance and service "
"guides."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:12
msgid ""
"It is assumed that where servers need to be taken off-line for hardware "
"replacement, that this will be done in series, bringing the server back on-"
"line before taking the next off-line."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:16
msgid ""
"It is assumed that the operations directed procedure will be used for "
"identifying hardware for replacement."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:20
msgid "Assessing the health of swift"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:22
msgid ""
"You can run the swift-recon tool on a Swift proxy node to get a quick check "
"of how Swift is doing. Please note that the numbers below are necessarily "
"somewhat subjective. Sometimes parameters for which we say 'low values are "
"good' will have pretty high values for a time. Often if you wait a while "
"things get better."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:28
msgid "For example:"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:48
msgid ""
"In the example above we ask for information on replication times (-r), load "
"averages (-l) and async pendings (-a). This is a healthy Swift system. Rules-"
"of-thumb for 'good' recon output are:"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:52
msgid ""
"Nodes that respond are up and running Swift. If all nodes respond, that is a "
"good sign. But some nodes may time out. For example:"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:60
msgid "That could be okay or could require investigation."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:62
msgid ""
"Low values (say < 10 for high and average) for async pendings are good. "
"Higher values occur when disks are down and/or when the system is heavily "
"loaded. Many simultaneous PUTs to the same container can drive async "
"pendings up. This may be normal, and may resolve itself after a while. If it "
"persists, one way to track down the problem is to find a node with high "
"async pendings (with ``swift-recon -av | sort -n -k4``), then check its "
"Swift logs, Often async pendings are high because a node cannot write to a "
"container on another node. Often this is because the node or disk is offline "
"or bad. This may be okay if we know about it."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:73
msgid ""
"Low values for replication times are good. These values rise when new rings "
"are pushed, and when nodes and devices are brought back on line."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:77
msgid ""
"Our 'high' load average values are typically in the 9-15 range. If they are "
"a lot bigger it is worth having a look at the systems pushing the average "
"up. Run ``swift-recon -av`` to get the individual averages. To sort the "
"entries with the highest at the end, run ``swift-recon -av | sort -n -k4``."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:83
msgid ""
"For comparison here is the recon output for the same system above when two "
"entire racks of Swift are down:"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:141
msgid ""
"The replication times and load averages are within reasonable parameters, "
"even with 80 object stores down. Async pendings, however is quite high. This "
"is due to the fact that the containers on the servers which are down cannot "
"be updated. When those servers come back up, async pendings should drop. If "
"async pendings were at this level without an explanation, we have a problem."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:149
msgid "Recon examples"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:151
msgid "Here is an example of noting and tracking down a problem with recon."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:153
msgid "Running reccon shows some async pendings:"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:171
msgid ""
"Why? Running recon again with -av swift (not shown here) tells us that the "
"node with the highest (23) is <redacted>.72.61. Looking at the log files on "
"<redacted>.72.61 we see:"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:207
msgid ""
"That is why this node has a lot of async pendings: a bunch of disks that are "
"not mounted on <redacted> and <redacted>. There may be other issues, but "
"clearing this up will likely drop the async pendings a fair bit, as other "
"nodes will be having the same problem."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:213
msgid "Assessing the availability risk when multiple storage servers are down"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:217
msgid ""
"This procedure will tell you if you have a problem, however, in practice you "
"will find that you will not use this procedure frequently."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:220
msgid ""
"If three storage nodes (or, more precisely, three disks on three different "
"storage nodes) are down, there is a small but nonzero probability that user "
"objects, containers, or accounts will not be available."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:226
msgid "Procedure"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:230
msgid ""
"swift has three rings: one each for objects, containers and accounts. This "
"procedure should be run three times, each time specifying the appropriate "
"``*.builder`` file."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:234
msgid ""
"Determine whether all three nodes are in different Swift zones by running "
"the ring builder on a proxy node to determine which zones the storage nodes "
"are in. For example:"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:251
msgid ""
"Here, node <redacted>.4 is in zone 1. If two or more of the three nodes "
"under consideration are in the same Swift zone, they do not have any ring "
"partitions in common; there is little/no data availability risk if all three "
"nodes are down."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:256
msgid ""
"If the nodes are in three distinct Swift zones it is necessary to whether "
"the nodes have ring partitions in common. Run ``swift-ring`` builder again, "
"this time with the ``list_parts`` option and specify the nodes under "
"consideration. For example:"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:276
msgid ""
"The ``list_parts`` option to the ring builder indicates how many ring "
"partitions the nodes have in common. If, as in this case,  the first entry "
"in the list has a 'Matches' column of 2 or less,  there is no data "
"availability risk if all three nodes are down."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:281
msgid ""
"If the 'Matches' column has entries equal to 3, there is some data "
"availability risk if all three nodes are down. The risk is generally small, "
"and is proportional to the number of entries that have a 3 in the Matches "
"column. For example:"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:301
msgid "A quick way to count the number of rows with 3 matches is:"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:309
msgid ""
"In this case the nodes have 30 out of a total of 2097152 partitions in "
"common; about 0.001%. In this case the risk is small/nonzero. Recall that a "
"partition is simply a portion of the ring mapping space, not actual data. So "
"having partitions in common is a necessary but not sufficient condition for "
"data unavailability."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:317
msgid ""
"We should not bring down a node for repair if it shows Matches entries of 3 "
"with other nodes that are also down."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:320
msgid ""
"If three nodes that have 3 partitions in common are all down, there is a "
"nonzero probability that data are unavailable and we should work to bring "
"some or all of the nodes up ASAP."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:325
msgid "Swift startup/shutdown"
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:327
msgid "Use reload - not stop/start/restart."
msgstr ""

#: ../../source/ops_runbook/maintenance.rst:329
msgid ""
"Try to roll sets of servers (especially proxy) in groups of less than 20% of "
"your servers."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:3
msgid "Software configuration procedures"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:8
msgid "Fix broken GPT table (broken disk partition)"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:10
msgid ""
"If a GPT table is broken, a message like the following should be observed "
"when the command..."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:17
msgid "... is run."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:26
msgid "To fix this, firstly install the ``gdisk`` program to fix this:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:32
msgid "Run ``gdisk`` for the particular drive with the damaged partition:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:55
msgid ""
"On the command prompt, type ``r`` (recovery and transformation options), "
"followed by ``d`` (use main GPT header) , ``v`` (verify disk) and finally "
"``w`` (write table to disk and exit). Will also need to enter ``Y`` when "
"prompted in order to confirm actions."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:93
msgid "Running the command:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:99
msgid "Should now show that the partition is recovered and healthy again."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:101
msgid "Finally, uninstall ``gdisk`` from the node:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:110
msgid "Procedure: Fix broken XFS filesystem"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:112
msgid ""
"A filesystem may be corrupt or broken if the following output is observed "
"when checking its label:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:125
msgid ""
"Run the following commands to remove the broken/corrupt filesystem and "
"replace. (This example uses the filesystem ``/dev/sdb2``) Firstly need to "
"replace the partition:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:168
msgid "Next step is to scrub the filesystem and format:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:186
msgid "You should now label and mount your filesystem."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:188
msgid "Can now check to see if the filesystem is mounted using the command:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:197
msgid "Procedure: Checking if an account is okay"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:201
msgid ""
"``swift-direct`` is only available in the HPE Helion Public Cloud. Use "
"``swiftly`` as an alternate (or use ``swift-get-nodes`` as explained here)."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:205
msgid ""
"You must know the tenant/project ID. You can check if the account is okay as "
"follows from a proxy."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:211
msgid ""
"The response will either be similar to a swift list of the account "
"containers, or an error indicating that the resource could not be found."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:214
msgid ""
"Alternatively, you can use ``swift-get-nodes`` to find the account database "
"files. Run the following on a proxy:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:221
msgid ""
"The response will print curl/ssh commands that will list the replicated "
"account databases. Use the indicated ``curl`` or ``ssh`` commands to check "
"the status and existence of the account."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:226
msgid "Procedure: Getting  swift account stats"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:230
msgid ""
"``swift-direct`` is specific to the HPE Helion Public Cloud. Go look at "
"``swifty`` for an alternate or use ``swift-get-nodes`` as explained in :ref:"
"`checking_if_account_ok`."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:234
msgid ""
"This procedure describes how you determine the swift usage for a given swift "
"account, that is the number of containers, number of objects and total bytes "
"used. To do this you will need the project ID."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:238
msgid "Log onto one of the swift proxy servers."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:240
msgid "Use swift-direct to show this accounts usage:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:256
msgid ""
"This account has 1 container. That container has 8436776 objects. The total "
"bytes used is 67440225625994."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:260
msgid "Procedure: Revive a deleted account"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:262
msgid ""
"Swift accounts are normally not recreated. If a tenant/project is deleted, "
"the account can then be deleted. If the user wishes to use Swift again, the "
"normal process is to create a new tenant/project -- and hence a new Swift "
"account."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:267
msgid ""
"However, if the Swift account is deleted, but the tenant/project is not "
"deleted from Keystone, the user can no longer access the account. This is "
"because the account is marked deleted in Swift. You can revive the account "
"as described in this process."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:274
msgid ""
"The containers and objects in the \"old\" account cannot be listed anymore. "
"In addition, if the Account Reaper process has not finished reaping the "
"containers and objects in the \"old\" account, these are effectively "
"orphaned and it is virtually impossible to find and delete them to free up "
"disk space."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:280
msgid ""
"The solution is to delete the account database files and re-create the "
"account as follows:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:283
msgid ""
"You must know the tenant/project ID. The account name is AUTH_<project-id>. "
"In this example, the tenant/project is ``4ebe3039674d4864a11fe0864ae4d905`` "
"so the Swift account name is ``AUTH_4ebe3039674d4864a11fe0864ae4d905``."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:287
msgid ""
"Use ``swift-get-nodes`` to locate the account's database files (on three "
"servers). The output has been truncated so we can focus on the import pieces "
"of data:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:308
msgid ""
"Before proceeding check that the account is really deleted by using curl. "
"Execute the commands printed by ``swift-get-nodes``. For example:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:318
msgid ""
"Repeat for the other two servers (192.168.245.3 and 192.168.245.4). A ``404 "
"Not Found`` indicates that the account is deleted (or never existed)."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:321
msgid "If you get a ``204 No Content`` response, do **not** proceed."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:323
msgid ""
"Use the ssh commands printed by ``swift-get-nodes`` to check if database "
"files exist. For example:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:336
#: ../../source/ops_runbook/procedures.rst:353
msgid "Repeat for the other two servers (192.168.245.3 and 192.168.245.4)."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:338
msgid "If no files exist, no further action is needed."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:340
msgid ""
"Stop Swift processes on all nodes listed by ``swift-get-nodes`` (In this "
"example, that is 192.168.245.3, 192.168.245.4 and 192.168.245.5)."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:343
msgid "We recommend you make backup copies of the database files."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:345
msgid "Delete the database files. For example:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:355
msgid "Restart Swift on all three servers"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:357
msgid ""
"At this stage, the account is fully deleted. If you enable the auto-create "
"option, the next time the user attempts to access the account, the account "
"will be created. You may also use swiftly to recreate the account."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:363
msgid ""
"Procedure: Temporarily stop load balancers from directing traffic to a proxy "
"server"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:365
msgid ""
"You can stop the load balancers sending requests to a proxy server as "
"follows. This can be useful when a proxy is misbehaving but you need Swift "
"running to help diagnose the problem. By removing from the load balancers, "
"customer's are not impacted by the misbehaving proxy."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:370
msgid ""
"Ensure that in /etc/swift/proxy-server.conf the ``disable_path`` variable is "
"set to ``/etc/swift/disabled-by-file``."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:373
msgid "Log onto the proxy node."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:375
msgid "Shut down Swift as follows:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:383
msgid "Shutdown, not stop."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:385
msgid "Create the ``/etc/swift/disabled-by-file`` file. For example:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:391
msgid "Optional, restart Swift:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:397
msgid ""
"It works because the healthcheck middleware looks for /etc/swift/disabled-by-"
"file. If it exists, the middleware will return 503/error instead of 200/OK. "
"This means the load balancer should stop sending traffic to the proxy."
msgstr ""

#: ../../source/ops_runbook/procedures.rst:402
msgid "Procedure: Ad-Hoc disk performance test"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:404
msgid "You can get an idea whether a disk drive is performing as follows:"
msgstr ""

#: ../../source/ops_runbook/procedures.rst:410
msgid ""
"You can expect ~600MB/sec. If you get a low number, repeat many times as "
"Swift itself may also read or write to the disk, hence giving a lower number."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:3
msgid "Troubleshooting tips"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:6
msgid ""
"Diagnose: Customer complains they receive a HTTP status 500 when trying to "
"browse containers"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:8
msgid ""
"This entry is prompted by a real customer issue and exclusively focused on "
"how that problem was identified. There are many reasons why a http status of "
"500 could be returned. If there are no obvious problems with the swift "
"object store, then it may be necessary to take a closer look at the users "
"transactions. After finding the users swift account, you can search the "
"swift proxy logs on each swift proxy server for transactions from this user. "
"The linux ``bzgrep`` command can be used to search all the proxy log files "
"on a node including the ``.bz2`` compressed files. For example:"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:34
msgid "This shows a ``GET`` operation on the users account."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:38
msgid ""
"The HTTP status returned is 404, Not found, rather than 500 as reported by "
"the user."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:40
msgid ""
"Using the transaction ID, ``tx429fc3be354f434ab7f9c6c4206c1dc3`` you can "
"search the swift object servers log files for this transaction ID:"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:75
msgid ""
"The 3 GET operations to 3 different object servers that hold the 3 replicas "
"of this users account. Each ``GET`` returns a HTTP status of 404, Not found."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:79
msgid ""
"Next, use the ``swift-get-nodes`` command to determine exactly where the "
"user's account data is stored:"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:119
msgid ""
"Check each of the primary servers, <redacted>.31, <redacted>.204.70  and "
"<redacted>.72.16, for this users account. For example on <redacted>.72.16:"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:131
msgid ""
"So this users account db, an sqlite db is present. Use sqlite to checkout "
"the account:"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:167
msgid ""
"Next try and find the ``DELETE`` operation for this account in the proxy "
"server logs:"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:182
msgid ""
"From this you can see the operation that resulted in the account being "
"deleted."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:185
msgid "Procedure: Deleting objects"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:188
msgid "Simple case - deleting small number of objects and containers"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:192
msgid ""
"``swift-direct`` is specific to the Hewlett Packard Enterprise Helion Public "
"Cloud. Use ``swiftly`` as an alternative."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:197
msgid ""
"Object and container names are in UTF8. Swift direct accepts UTF8 directly, "
"not URL-encoded UTF8 (the REST API expects UTF8 and then URL-encoded). In "
"practice cut and paste of foreign language strings to a terminal window will "
"produce the right result."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:202
msgid "Hint: Use the ``head`` command before any destructive commands."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:204
msgid ""
"To delete a small number of objects, log into any proxy node and proceed as "
"follows:"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:207
msgid "Examine the object in question:"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:213
msgid ""
"See if ``X-Object-Manifest`` or ``X-Static-Large-Object`` is set, then this "
"is the manifest object and segment objects may be in another container."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:217
msgid ""
"If the ``X-Object-Manifest`` attribute is set, you need to find the name of "
"the objects this means it is a DLO. For example, if ``X-Object-Manifest`` is "
"``container2/seg-blah``, list the contents of the container container2 as "
"follows:"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:226
msgid ""
"Pick out the objects whose names start with ``seg-blah``. Delete the segment "
"objects as follows:"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:235
msgid ""
"If ``X-Static-Large-Object`` is set, you need to read the contents. Do this "
"by:"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:237
msgid "Using swift-get-nodes to get the details of the object's location."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:238
msgid "Change the ``-X HEAD`` to ``-X GET`` and run ``curl`` against one copy."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:239
msgid "This lists a JSON body listing containers and object names"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:240
msgid "Delete the objects as described above for DLO segments"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:242
msgid ""
"Once the segments are deleted, you can delete the object using ``swift-"
"direct`` as described above."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:245
msgid "Finally, use ``swift-direct`` to delete the container."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:248
msgid "Procedure: Decommissioning swift nodes"
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:250
msgid ""
"Should Swift nodes need to be decommissioned (e.g.,, where they are being re-"
"purposed), it is very important to follow the following steps."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:253
msgid ""
"In the case of object servers, follow the procedure for removing the node "
"from the rings."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:255
msgid ""
"In the case of swift proxy servers, have the network team remove the node "
"from the load balancers."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:257
msgid "Open a network ticket to have the node removed from network firewalls."
msgstr ""

#: ../../source/ops_runbook/troubleshooting.rst:259
msgid ""
"Make sure that you remove the ``/etc/swift`` directory and everything in it."
msgstr ""