位运算概述-C++-PHP中文网

An Overview of Bitwise Operations

以下帖子取自我创建的存储库，旨在帮助学习（和教授）按位运算。您可以在这里找到该存储库，我建议您查看一下 - 那里有一些代码示例和解决方案。

介绍

此存储库的目标是让您熟悉按位运算，解释它们是什么、它们如何工作以及它们的用途。

第一章：一切都是二进制的

在 C（以及大多数高级语言）中，我们的变量具有类型。这些类型表明了一些事情。当然，int 类型的变量将存储整数值，但理解这些按位运算的关键是要知道，在幕后，所有类型都以二进制形式存储在内存中（任何位置、堆栈、堆，无论何处）。以下是当您在 C:
中的堆栈上存储简单整数值时会发生的情况的示例

int main(int argc, char** argv) {
    int x = 2;
    return 0;
}

登录后复制

编译为汇编后，代码可能如下所示（我这里使用的是ARM汇编，并且我已经使用注释对代码进行了注释）：

.section .text
.global main

main:
    ; Set up the stack for the function
    stp x29, x30 [sp, -16]! ; Save previous frame pointer & link register
    mov x29, sp ; Setup a new frame pointer

    ; Initialize x with 2
    ; IN C: int x = 2;
    mov w0, 2 ; Move 2 into the w0 register
    str w0, [sp, 16] ; Store the contents of w0 (2) at a 16-byte offset from the stack pointer
    ; Essentially, the above line stores 2 on the stack.

    mov w0, 0 ; Move 0 into w0, prepare for return

    ; Clear stack
    ldp x29, x30, [sp], 32 ; Restore frame pointer and link register
    ret ; Return

登录后复制

请注意，大多数编译器不会实际存储像我在堆栈上显示的那样的变量，因为它未被使用。但是，如果多次使用，它会像上面一样存储在堆栈中。

如果我们查看变量在堆栈中存储的位置（当然，它就在那里），我们会看到类似以下内容：

Memory Address	Value Stored (Hex)	Value Stored (Binary)
0x1000	0x02	00000010
0x1001	0x00	00000000
0x1002	0x00	00000000
0x1003	0x00	00000000

这假设您的系统是小尾数法。我不会在这里讨论字节顺序，但您可以在这里阅读更多相关信息。

关于上表，我希望您注意的关键一点是，尽管我们的整数只有 2 位长，但它占用了 4 个字节（32 位）的内存。现在，不要惊慌——这是正常现象，也是意料之中的。 C 和编译器所做的许多事情之一就是为您调用的类型设置标准。因此，当我创建 int 变量时，编译器知道分配 4 个字节（同样是 32 位）内存。我们甚至可以使用 C 中的 sizeof() 运算符来测试这一点。

sizeof() 运算符

sizeof() 不是实际的 C 函数。相反，在编译时，编译器会将表达式替换为给定数据类型的大小。您甚至可以将其与您自己的类型一起使用，例如 typedef 和/或结构：
#include <stdio.h> 

typedef struct {
  char name[64];
  int age;
} Person;

int main(int argc, char** argv) {
  printf("A Person is %lu bytes long.\n", sizeof(Person));
  return 0;
}
登录后复制

您可能会问的另一件事是如何存储负数。很好的问题。数字可以是有符号或无符号，但默认情况下，它们是有符号的。如果一个整数是有符号的，它将牺牲其最高有效位作为“符号位”。如果符号位为1，则该数为负；否则它是积极的。精明的读者可能会意识到这里发生的变化在可能的数字范围内。考虑 8 位数字。有 256 个可能的数字可以表示（由 2^8 给出）。使用无符号8位整数，可以表示0-255的值；使用有符号的 8 位 int，可以表示 -128–127。

要获得二进制数的倒数，请使用二进制补码。我们来求二进制的 -5。

从5开始。在二进制中，5是0101。前导0是符号。
反转每一位。 0101→1010。
将此数字加 1（忽略任何可能的溢出）。 1010 + 0001 = 1011.

轮到你了！

通过对 -5 进行二进制补码得到 5 或 0101，确认 -5 是二进制的 1011。
编写一个 C 程序，以字节和位的形式打印 int 的大小。使用上面的代码作为起点。 提示：从字节转换为位，一个字节有多少位？
将不同类型的尺寸填写到下表中，修改您的程序来检查它们。

Type	Size (bytes)	Size (bits)
int
int64_t
int8_t
char
bool (you'll need to #include )
long int
short int
long long int
double
double

Sample Responses

Question 1

Start with -5: 1011.
Invert each bit: 1011 → 0100.
Add 1: 0100 + 0001 = 0101

Question 2

Here's an example of what your simple program might look like (you can also check it out at Chapter 1/SizeOfOperatorTest.c).

 #include <stdio.h>

 int main(int argc, char** argv) {
    printf("The size of an int is %lu bytes, or %lu bits.\n", sizeof(int), sizeof(int) * 8);
    return 0;
 }

登录后复制

Go ahead and compile it using gcc and check out its output:

cd Chapter\ 1
gcc -o sizeof SizeOfOperatorTest.c
./sizeof

登录后复制

Question 3

Type	Size (bytes)	Size (bits)
int	4	32
int64_t	8	64
int8_t	1	8
char	1	8
bool (you'll need to #include )	1	8
long int	4	32
short int	2	16
long long int	8	64
double	4	32
double	8	64

Take This Home

The main point I'd like you to keep in mind is that with control of every bit, we can optimize our memory usage. Though that has little effect on modern systems, in the case of embedded computing, every byte counts. By manually reading and writing bits as opposed to typical variable values, we can harness more functionality from less storage.

Chapter 2: Operating on Bits

Now that we've covered data types and how data is stored, we're ready to introduce the idea of bitwise operations. Put simply, a bitwise operation is an operation done on each bit of a piece of data. Let's take a look at each bitwise operator. Then, we'll implement them in C.

And (&)

Written x & y in C. This operator takes the corresponding bits of each number and performs an and operation. An and operation returns 1 (true) if and only if both bits are 1. This means that two bits that are both 0 do not return 1—they return 0. The result is the number made up of the binary string of results from each and. It's easiest to see this in a truth table.

Consider the operation int result = 3 & 5. First, convert 3 and 5 to binary.
Now, we have int result = 011 & 101. Consider the following truth table:

Bit A	Bit B	AND
0	1	0
1	0	0
1	1	1

The result of the and operation is 001, which when converted to decimal is 1.

Or (|)

Written x | y in C. This operator takes the corresponding bits of each number and performs an or operation. An or operation returns 1 if either bit is 1. As with other bitwise operators, the result is the number made up of the binary string of results from each or.

Consider the operation int result = 3 | 5. First, convert 3 and 5 to binary.
Now, we have int result = 011 | 101. Consider the following truth table:

Bit A	Bit B	OR
0	1	1
1	0	1
1	1	1

The result of the or operation is 111, which when converted to decimal is 7.

Xor (^)

Written x ^ y in C. This operator takes the corresponding bits of each number and performs an xor (exclusive or) operation. An xor operation returns 1 if and only if one of the two bits is 1. As with other bitwise operators, the result is the number made up of the binary string of results from each or.

Consider the operation int result = 3 ^ 5. First, convert 3 and 5 to binary.
Now, we have int result = 011 ^ 101. Consider the following truth table:

Bit A	Bit B	XOR
0	1	1
1	0	1
1	1	0

异或运算的结果是110，转换成十进制后是6。

左移 (<<)

写x << amount 与上述运算符不同，该运算符仅对一个数字进行操作。给定数字中的每一位都会向左移动给定的数量。到达数字末尾的任何位都会被截断（并且零出现在右侧）。让我们来看一个例子。

考虑 int result = 5 << 2. 如您所知，5 的二进制数是 101。让我们来看看转变的每次迭代。

初始

1	0	1

一班后

0	1	0

结果

1	0	0

二进制结果是100，十进制是4。

右移 (>>)

写x>> amount 这个运算符就像左移一样，只是它的工作方向相反。给定数字中的每一位都会向右移动给定的数量。到达数字末尾的任何位都会被截断（并且零出现在左侧）。让我们来看一个例子。

考虑 int result = 3 >> 2. 如您所知，3 的二进制是 011。让我们来看看转变的每次迭代。

初始

0	1	1

一班后

0	0	1

结果

0	0	0

二进制结果是000，十进制就是0。

不是(~)

写~x。 not 运算符反转给定数字的所有位。我们将再次使用真值表来详细说明。

考虑 int result = ~5。如您所知，5 的二进制数是 101。

Bit A	~ Bit A
1	0
0	1
1	0

因此，非运算的结果是010，即二进制的2。

左移和右移限制

这些轮班操作有一些值得注意的限制。对于初学者来说，您不能将位移位负数 - 这根本没有意义！此外，移位超过分配给变量的位数将被视为未定义。你可以做到这一点，但它的输出不能保证对于给定值是恒定的。最后，虽然不是一个限制，但移位 0 次根本不会执行移位。

轮到你了！

完成以下每一项的真值表。考虑所有值都是无符号的。完成后转换为十进制。
8 & 2
6 | 3
7 ^ 5
(5 & 2) & (4 & 3)
(5 | 2) & (4 & 3)
(5 & 2) ^ (4 | 3)
完成每项操作。考虑所有值都是无符号的，并且只要问题中的最长值需要是无符号的（即，如果最大值为 8，则处理 4 位）。完成后转换为十进制。
~6
9 << 4（这里考虑该值的长度为32，因此有左移的空间）。
~(7 & 8)
(2 | 6) >> 1
8>> (~2)
~((3 >> 2) ^ ~7) & (~(7 >> 4) | 2)

回复示例

问题1

8 & 2 → 1000 & 0010

Bit A	Bit B	AND
1	0	0
0	0	0
0	1	0
0	0	0

⇒ 0000，即十进制的 0。

6 | 3 → 110 | 011

Bit A	Bit B	OR
1	0	1
1	1	1
0	1	1

⇒ 111，十进制为 7。

7^5→111^101

Bit A	Bit B	XOR
1	1	0
1	0	1
1	1	0

⇒ 010，十进制 2。

(5 & 2) & (4 & 3) → (101 & 010) & (100 & 011)

Bit A	Bit B	Bit C	Bit D
1	0	1	0
0	1	0	1
1	0	0	1

⇒ 000，即十进制的 0。

(5 | 2) & (4 & 3) → (101 | 010) & (100 & 011)

Bit A	Bit B	A OR B	Bit C	Bit D
1	0	1	1	0
0	1	1	0	1
1	0	1	0	1

⇒ 000，即十进制的 0。

(5 & 2) ^ (4 | 3) → (101 & 010) ^ (100 | 011)

Bit A	Bit B	Bit C	Bit D	C OR D	(A AND B) XOR (C OR D)
1	0	1	0	1	1
0	1	0	1	1	1
1	0	0	1	1	1

⇒ 111, which is 7 in decimal.

Question 2

~6 → ~110 ⇒ 011, which is 3 in decimal.
9 << 4 → 001001 << 4 ⇒ 100100, which is 36 in decimal.
~(7 & 8) → ~(0111 & 1000) → ~(0000) ⇒ 1111, which is 15 in decimal.
(2 | 6) >> 1 → (010 | 110) >> 1 → 110 >> 1 ⇒ 011, which is 3 in decimal.
8 >> (~2) → 1000 >> ~(10) → 1000 >> (01) → 1000 >> 1 ⇒ 0100, which is 4 in decimal.
~((3 >> 2) ^ ~7) & (~(7 >> 4) | 2)

To solve this, I suggest splitting into halves:

~((3 >> 2) ^ ~7) and (~(7 >> 4) | 2)

~((3 >> 2) ^ ~7) → ~((011 >> 2) ^ ~(111)) → ~((000) ^ ~(111)) → ~(000 ^ 000) → 111
(~(7 >> 4) | 2) → (~(111 >> 4) | 010) → (~(000) | 010) → (111 | 010) → 111

Hence, 111 & 111 ⇒ 111, which is 7 in decimal.

Chapter 3: Applying Bitwise Operations in C

This chapter is all about writing C code that utilizes bitwise operators. Before we get to doing bitwise operations, we should begin by writing a function that can write the binary equivalent of a given variable.

To do this, we use a mask. We initialize it to contain a 1 in the most significant (leftmost in little-endian systems) bit followed by zeros. With each iteration of a loop, we right shift the mask by 1, moving the 1 all the way "across" the mask. When we use the & operator on the pointer and the mask, any non-zero value means that a 1 occurred somewhere in the result. Since there's only one 1 in the mask, we know exactly where this occurred. Since the loop moves from left to right, we can just append the corresponding bit's character to the string. The string is one character longer than the size because it needs to contain the null character (\0). This is how printf knows to stop printing, and omitting it can lead to numerous errors and/or unexpected behaviors (like the printing of other data from in memory).

void printBinary(unsigned int decimal) {

    // To determine size (in bits), we multiply the maximum size of an unsigned int by 8 (to convert from bytes to bits).
    int size = sizeof(decimal) * 8;

    // This counts the leading zeros of the number, then subtracts them from the size.
    // Hence, we start at the first bit set to 1 (unless decimal == 0)
    size -= __builtin_clz(decimal);

    if(size == 0) size = 1; // Handle zero case, we need one space to display "0."

    int curr = 0;
    char binary[size + 1];
    for(unsigned int mask = 1 << (size - 1); mask != 0; mask >>= 1) {
        if((decimal & mask) != 0) {
            binary[curr] = '1';
        } else {
            binary[curr] = '0';
        }
        curr++;
    }

    binary[curr] = '\0';

    printf("%s", binary);

}

登录后复制

Bitwise Assignment Operators

All bitwise operators, except for not (~), can be used in the assignment fashion. This means you can add an equals sign right next to one of the bitwise operator. For example, in
int a = 2;
a &= 7;
登录后复制
a is both the first operand and the destination. In other words, the value of a & 7 is stored in a. This works for all bitwise operators aside from the not (~) operator.

Now, I'd like to provide a few case study like prompts for you to try. Sample responses are available.

Case Study 1: Unix File Permissions

One use case of bitwise operations is in the Unix permission system. You've probably run the command

chmod 777 some-file

登录后复制

But what do each of the numbers mean? And why 7? The reality is, binary is at play here, and 7 should tip you off. Recall that 7 in binary is 111. The number being passed here is acting as three flags or booleans glued together.

The three digits specified are for three classes of users:

The file owner;
Group members of the file owner;
and everyone else.

As I mentioned above, each digit is a conglomeration of three flags, each representing a permission. The flag (binary bit) in the fours place represents read permission, the twos place is for write permission, and the ones is for execute. So,

chmod 777 some-file

is doing this under the hood: