首頁 Java java教程 保護變形:分析 Kafka 項目

保護變形:分析 Kafka 項目

Oct 16, 2024 pm 08:09 PM

Have you ever wondered what bugs might be lurking in the project source code of global companies? Don't miss the chance to spot interesting bugs detected by the PVS-Studio static analyzer in the open-source Apache Kafka project.

Belay the Metamorphosis: analyzing Kafka project

Introduction

Apache Kafka is a well-known open-source project that's mostly written in Java. LinkedIn developed it in 2011 as a message broker, i.e. a data pipeline for various system components. Today, it's one of the most popular solutions in its category.

Ready to take a look under the hood?

P.S.
Just wanted to leave a quick note about the title. It references Franz Kafka's "The Metamorphosis", where the main character turns into monstrous vermin. Our static analyzer fights to keep your projects from transfiguring into monstrous vermin transforming into one colossal bug, so say no to "The Metamorphosis".

Oh no, bugs

All humor is rooted with pain

These aren't my words; the quote belongs to Richard Pryor. But what does that matter? The first thing I'd like to tell you about is a silly error. Yet, after many attempts to understand why the program doesn't work properly, it's frustrating to encounter something like the example below:

@Override
public KeyValueIterator<Windowed<K>, V> backwardFetch(
  K keyFrom,
  K keyTo,
  Instant timeFrom,
  Instant timeTo) {
  ....
  if (keyFrom == null && keyFrom == null) {   // <=
    kvSubMap = kvMap;
  } else if (keyFrom == null) {
    kvSubMap = kvMap.headMap(keyTo, true);
  } else if (keyTo == null) {
    kvSubMap = kvMap.tailMap(keyFrom, true);
  } else {
    // keyFrom != null and KeyTo != null 
    kvSubMap = kvMap.subMap(keyFrom, true, keyTo, true);
  } 
  ....
}
登入後複製

As you can see, here's something that no developer can avoid—a trivial typo. In the very first condition, developers wanted to use the following logical expression:

keyFrom == null && keyTo == null
登入後複製

The analyzer issued two warnings:

V6001 There are identical sub-expressions 'keyFrom == null' to the left and to the right of the '&&' operator. ReadOnlyWindowStoreStub.java 327, ReadOnlyWindowStoreStub.java 327

V6007 Expression 'keyFrom == null' is always false. ReadOnlyWindowStoreStub.java 329

We can see why. Such chucklesome typos are timeless for every developer. While we can spend a lot of time searching for them, and yet it won't be a piece of cake to recall where it has lurked.

In the same class, there is exactly the same error in another method. I think it's fair to call this copypasta.

@Override
public KeyValueIterator<Windowed<K>, V> fetch(
  K keyFrom,
  K keyTo,
  Instant timeFrom,
  Instant timeTo) {
  ....
  NavigableMap<K, V> kvMap = data.get(now);
  if (kvMap != null) {
    NavigableMap<K, V> kvSubMap;
    if (keyFrom == null && keyFrom == null) {      // <=
      kvSubMap = kvMap;
    } else if (keyFrom == null) {
      kvSubMap = kvMap.headMap(keyTo, true);
    } else if (keyTo == null) {
      kvSubMap = kvMap.tailMap(keyFrom, true);
    } else {
      // keyFrom != null and KeyTo != null
      kvSubMap = kvMap.subMap(keyFrom, true, keyTo, true);
    }
  }
  ....
}
登入後複製

Here are the same warnings:

V6007 Expression 'keyFrom == null' is always false. ReadOnlyWindowStoreStub.java 273

V6001 There are identical sub-expressions 'keyFrom == null' to the left and to the right of the '&&' operator. ReadOnlyWindowStoreStub.java 271, ReadOnlyWindowStoreStub.java 271

No need to worry—we won't have to look at hundreds of code lines at once. PVS-Studio is great at handling such simple things. How about tackling something a little more challenging?

Mutable synchronized

What's the purpose of the synchronized keyword in Java? Here, I'll only focus on the synchronized methods, not blocks. According to the Oracle docs, the synchronized keyword declares a method as synchronized to ensure a thread-safe interaction with an instance. If a thread invokes a synchronized method of the instance, other threads that try to invoke synchronized methods of the same instance will be blocked (i.e. their execution will be suspended). They'll be blocked until the method invoked by the first thread processes its execution. This is needed when the instance is visible to more than one thread. The read/write operations of such instances should be executed only via synchronized methods.

The developers broke the rule in the Sensor class, as illustrated in the simplified code fragment below. The read/write operations on the instance field are executed through both synchronized and unsynchronized methods. It may lead to a race condition and make the output unpredictable.

private final Map<MetricName, KafkaMetric> metrics;

public void checkQuotas(long timeMs) {                  // <=
  for (KafkaMetric metric : this.metrics.values()) {
    MetricConfig config = metric.config();
    if (config != null) {
      ....
    }
  }
  ....
}  

public synchronized boolean add(CompoundStat stat,      // <=
                                MetricConfig config) {       
  ....
  if (!metrics.containsKey(metric.metricName())) {         
    metrics.put(metric.metricName(), metric);            
  }  
  ....
}  

public synchronized boolean add(MetricName metricName,  // <=
                                MeasurableStat stat, 
                                MetricConfig config) {  
  if (hasExpired()) {
    return false;
  } else if (metrics.containsKey(metricName)) {
    return true;
  } else {
    ....
    metrics.put(metric.metricName(), metric);
    return true;
  }
}
登入後複製

The analyzer warning looks like this:

V6102 Inconsistent synchronization of the 'metrics' field. Consider synchronizing the field on all usages. Sensor.java 49, Sensor.java 254

If different threads can change the instance state at once, the methods that allow this should be synchronized. If the program doesn't anticipate that several threads can interact with the instance, it's pointless to make its methods synchronized. In the worst case, it can even damage the program performance.

There are plenty of such errors in the program. Here's a similar code fragment for which the analyzer issued the warning:

private final PrefixKeyFormatter prefixKeyFormatter; 

@Override
public synchronized void destroy() {                // <=
  ....
  Bytes keyPrefix = prefixKeyFormatter.getPrefix();
  ....
}

@Override
public void addToBatch(....) {                      // <=
  physicalStore.addToBatch(
    new KeyValue<>(
    prefixKeyFormatter.addPrefix(record.key),
    record.value
    ), batch
  );
} 

@Override
public synchronized void deleteRange(....) {        // <=
  physicalStore.deleteRange(
    prefixKeyFormatter.addPrefix(keyFrom),
    prefixKeyFormatter.addPrefix(keyTo)
  );
}

@Override
public synchronized void put(....) {                // <=
  physicalStore.put(
    prefixKeyFormatter.addPrefix(key),
    value
  );
}
登入後複製

The analyzer warning:

V6102 Inconsistent synchronization of the 'prefixKeyFormatter' field. Consider synchronizing the field on all usages. LogicalKeyValueSegment.java 60, LogicalKeyValueSegment.java 247

Iterator, iterator, and iterator again...

In the example, there are two rather unpleasant errors within one line at once. I'll explain their nature within the part of the article. Here's a code snippet:

private final Map<String, Uuid> topicIds = new HashMap(); 

private Map<String, KafkaFutureVoid> handleDeleteTopicsUsingNames(....) { 
  ....
  Collection<String> topicNames = new ArrayList<>(topicNameCollection);

  for (final String topicName : topicNames) {
    KafkaFutureImpl<Void> future = new KafkaFutureImpl<>();

    if (allTopics.remove(topicName) == null) {
      ....
    } else {
      topicNames.remove(topicIds.remove(topicName));      // <=
      future.complete(null);
    }
    ....
  }
}
登入後複製

That's what the analyzer shows us:

V6066 The type of object passed as argument is incompatible with the type of collection: String, Uuid. MockAdminClient.java 569

V6053 The 'topicNames' collection of 'ArrayList' type is modified while iteration is in progress. ConcurrentModificationException may occur. MockAdminClient.java 569

Now that's a big dilemma! What's going on here, and how should we address it?!

First, let's talk about collections and generics. Using the generic types of collections helps us avoid ClassCastExceptions and cumbersome constructs where we convert types.

If we specify a certain data type when initializing a collection and add an incompatible type, the compiler won't compile the code.

Here's an example:

public class Test {
  public static void main(String[] args) {
    Set<String> set = new HashSet<>();
    set.add("str");
    set.add(UUID.randomUUID()); // java.util.UUID cannot be converted to
                                // java.lang.String
  }
}
登入後複製

However, if we delete an incompatible type from our Set, no exception will be thrown. The method returns false.

Here's an example:

public class Test {
  public static void main(String[] args) {
    Set<String> set = new HashSet<>();
    set.add("abc");
    set.add("def");
    System.out.println(set.remove(new Integer(13))); // false
  }
}
登入後複製

It's a waste of time. Most likely, if we encounter something like this in the code, this is an error. I suggest you go back to the code at the beginning of this subchapter and try to spot a similar case.

Second, let's talk about the Iterator. We can talk about iterating through collections for a long time. I don't want to bore you or digress from the main topic, so I'll just cover the key points to ensure we understand why we get the warning.

So, how do we iterate through the collection here? Here is what the for loop in the code fragment looks like:

for (Type collectionElem : collection) {
  ....
}
登入後複製

The for loop entry is just syntactic sugar. The construction is equivalent to this one:

for (Iterator<Type> iter = collection.iterator(); iter.hasNext();) {
  Type collectionElem = iter.next();
  ....
}
登入後複製

We're basically working with the collection iterator. All right, that's sorted! Now, let's discuss ConcurrentModificationException.

ConcurrentModificationException is an exception that covers a range of situations both in single-threaded and multi-threaded programs. Here, we're focusing on single-threading. We can find an explanation quite easily. Let's take a peek at the Oracle docs: a method can throw the exception when it detects parallel modification of an object that doesn't support it. In our case, while the iterator is running, we delete objects from the collection. This may cause the iterator to throw a ConcurrentModificationException.

How does the iterator know when to throw the exception? If we look at the ArrayList collection, we see that its parent, AbstactList, has the modCount field that stores the number of modifications to the collection:

protected transient int modCount = 0;
登入後複製

Here are some usages of the modCount counter in the ArrayList class:

public boolean add(E e) {
  modCount++;
  add(e, elementData, size);
  return true;
}

private void fastRemove(Object[] es, int i) {
  modCount++;
  final int newSize;
  if ((newSize = size - 1) > i)
    System.arraycopy(es, i + 1, es, i, newSize - i);
  es[size = newSize] = null;
}
登入後複製

So, the counter is incremented each time when the collection is modified.

Btw, the fastRemove method is used in the remove method, which we use inside the loop.

Here's the small code fragment of the ArrayList iterator inner workings:

private class Itr implements Iterator<E> {
  ....
  int expectedModCount = modCount;            

  final void checkForComodification() {
  if (modCount != expectedModCount)               // <=
    throw new ConcurrentModificationException();
  }

  public E next() {
    checkForComodification();              
    ....
  }

  public void remove() {
    ....
    checkForComodification();             

    try {
      ArrayList.this.remove(lastRet);   
      ....
      expectedModCount = modCount;
    } catch (IndexOutOfBoundsException ex) {
      throw new ConcurrentModificationException();
    }
  }
  ....
  public void add(E e) {
    checkForComodification();            
    try {
      ....
      ArrayList.this.add(i, e);        
      ....
      expectedModCount = modCount;     
    } catch (IndexOutOfBoundsException ex) {
      throw new ConcurrentModificationException();
    }
  }
}
登入後複製

Let me explain that last fragment. If the collection modifications don't match the expected number of modifications (which is the sum of the initial modifications before the iterator was created and the number of the iterator operations), a ConcurrentModificationException is thrown. That's only possible when we modify the collection using its methods while iterating over it (i.e. in parallel with the iterator). That's what the second warning is about.

So, I've explained you the analyzer messages. Now let's put it all together:

We attempt to delete an element from the collection when the Iterator is still running:

topicNames.remove(topicIds.remove(topicName)); 
// topicsNames – Collection<String>
// topicsIds – Map<String, UUID>
登入後複製

However, since the incompatible element is passed to ArrayList for deletion (the remove method returns a UUID object from topicIds), the modification count won't increase, but the object won't be deleted. Simply put, that code section is rudimentary.

I'd venture to guess that the developer's intent is clear. If that's the case, one way to fix these two warnings could be as follows:

Collection<String> topicNames = new ArrayList<>(topicNameCollection);

List<String> removableItems = new ArrayList<>();

for (final String topicName : topicNames) {
  KafkaFutureImpl<Void> future = new KafkaFutureImpl<>();

  if (allTopics.remove(topicName) == null) {
    ....
  } else {
    topicIds.remove(topicName);
    removableItems.add(topicName);
    future.complete(null);
  }
  ....
}
topicNames.removeAll(removableItems);
登入後複製

Void, sweet void

Where would we go without our all-time favorite null and its potential problems, right? Let me show you the code fragment for which the analyzer issued the following warning:

V6008 Potential null dereference of 'oldMember' in function 'removeStaticMember'. ConsumerGroup.java 311, ConsumerGroup.java 323

@Override
public void removeMember(String memberId) {
  ConsumerGroupMember oldMember = members.remove(memberId);
  ....
  removeStaticMember(oldMember);
  ....
}

private void removeStaticMember(ConsumerGroupMember oldMember) {
  if (oldMember.instanceId() != null) {
    staticMembers.remove(oldMember.instanceId());
  }
}
登入後複製

If members doesn't contain an object with the memberId key, oldMember will be null. It can lead to a NullPointerException in the removeStaticMember method.

Boom! The parameter is checked for null:

if (oldMember != null && oldMember.instanceId() != null) {
登入後複製

The next error will be the last one in the article—I'd like to wrap things up on a positive note. The code below—as well as the one at the beginning of this article—has a common and silly typo. However, it can certainly lead to unpleasant consequences.

Let's take a look at this code fragment:

protected SchemaAndValue roundTrip(...., SchemaAndValue input) {
  String serialized = Values.convertToString(input.schema(),
                                             input.value());

  if (input != null && input.value() != null) {   
    ....
  }
  ....
}
登入後複製

Yeah, that's right. The method actually accesses the input object first, and then checks whether it's referencing null.

V6060 The 'input' reference was utilized before it was verified against null. ValuesTest.java 1212, ValuesTest.java 1213

Again, I'll note that such typos are ok. However, they can lead to some pretty nasty results. It's tough and inefficient to search for these things in the code manually.

Conclusion

In sum, I'd like to circle back to the previous point. Manually searching through the code for all these errors is a very time-consuming and tedious task. It's not unusual for issues like the ones I've shown to lurk in code for a long time. The last bug dates back to 2018. That's why it's a good idea to use static analysis tools. If you'd like to know more about PVS-Studio, the tool we have used to detect all those errors, you can find out more here.

That's all. Let's wrap things up here. "Oh, and in case I don't see ya, good afternoon, good evening, and good night."

Belay the Metamorphosis: analyzing Kafka project

I almost forgot! Catch a link to learn more about a free license for open-source projects.

以上是保護變形:分析 Kafka 項目的詳細內容。更多資訊請關注PHP中文網其他相關文章!

本網站聲明
本文內容由網友自願投稿,版權歸原作者所有。本站不承擔相應的法律責任。如發現涉嫌抄襲或侵權的內容,請聯絡admin@php.cn

熱AI工具

Undresser.AI Undress

Undresser.AI Undress

人工智慧驅動的應用程序,用於創建逼真的裸體照片

AI Clothes Remover

AI Clothes Remover

用於從照片中去除衣服的線上人工智慧工具。

Undress AI Tool

Undress AI Tool

免費脫衣圖片

Clothoff.io

Clothoff.io

AI脫衣器

Video Face Swap

Video Face Swap

使用我們完全免費的人工智慧換臉工具,輕鬆在任何影片中換臉!

熱工具

記事本++7.3.1

記事本++7.3.1

好用且免費的程式碼編輯器

SublimeText3漢化版

SublimeText3漢化版

中文版,非常好用

禪工作室 13.0.1

禪工作室 13.0.1

強大的PHP整合開發環境

Dreamweaver CS6

Dreamweaver CS6

視覺化網頁開發工具

SublimeText3 Mac版

SublimeText3 Mac版

神級程式碼編輯軟體(SublimeText3)

熱門話題

Java教學
1655
14
CakePHP 教程
1413
52
Laravel 教程
1306
25
PHP教程
1252
29
C# 教程
1226
24
公司安全軟件導致應用無法運行?如何排查和解決? 公司安全軟件導致應用無法運行?如何排查和解決? Apr 19, 2025 pm 04:51 PM

公司安全軟件導致部分應用無法正常運行的排查與解決方法許多公司為了保障內部網絡安全,會部署安全軟件。 ...

如何將姓名轉換為數字以實現排序並保持群組中的一致性? 如何將姓名轉換為數字以實現排序並保持群組中的一致性? Apr 19, 2025 pm 11:30 PM

將姓名轉換為數字以實現排序的解決方案在許多應用場景中,用戶可能需要在群組中進行排序,尤其是在一個用...

如何使用MapStruct簡化系統對接中的字段映射問題? 如何使用MapStruct簡化系統對接中的字段映射問題? Apr 19, 2025 pm 06:21 PM

系統對接中的字段映射處理在進行系統對接時,常常會遇到一個棘手的問題:如何將A系統的接口字段有效地映�...

如何優雅地獲取實體類變量名構建數據庫查詢條件? 如何優雅地獲取實體類變量名構建數據庫查詢條件? Apr 19, 2025 pm 11:42 PM

在使用MyBatis-Plus或其他ORM框架進行數據庫操作時,經常需要根據實體類的屬性名構造查詢條件。如果每次都手動...

IntelliJ IDEA是如何在不輸出日誌的情況下識別Spring Boot項目的端口號的? IntelliJ IDEA是如何在不輸出日誌的情況下識別Spring Boot項目的端口號的? Apr 19, 2025 pm 11:45 PM

在使用IntelliJIDEAUltimate版本啟動Spring...

Java對像如何安全地轉換為數組? Java對像如何安全地轉換為數組? Apr 19, 2025 pm 11:33 PM

Java對象與數組的轉換:深入探討強制類型轉換的風險與正確方法很多Java初學者會遇到將一個對象轉換成數組的�...

電商平台SKU和SPU數據庫設計:如何兼顧用戶自定義屬性和無屬性商品? 電商平台SKU和SPU數據庫設計:如何兼顧用戶自定義屬性和無屬性商品? Apr 19, 2025 pm 11:27 PM

電商平台SKU和SPU表設計詳解本文將探討電商平台中SKU和SPU的數據庫設計問題,特別是如何處理用戶自定義銷售屬...

如何利用Redis緩存方案高效實現產品排行榜列表的需求? 如何利用Redis緩存方案高效實現產品排行榜列表的需求? Apr 19, 2025 pm 11:36 PM

Redis緩存方案如何實現產品排行榜列表的需求?在開發過程中,我們常常需要處理排行榜的需求,例如展示一個�...

See all articles