Remove duplicates from a structure/list and keep the best one

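I have a classic "remove duplicates from a list/nested list" problem. However, the solution isn't straightforward because of the particular rules I'm trying to follow. I've written a sample application that works as desired, but it looks clunky. I'm looking for something more elegant and, if possible, more efficient. Perhaps LINQ/extension methods can help. Any suggestions?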

using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var sellers = new List<Seller>()
        {
            new Seller()
            {
                Id = 1,
                Products = new List<Product>()
                {
                    new Product() { Sku = "Alpha", Price = 5.0, Shipping = 2.0 },
                    new Product() { Sku = "Beta", Price = 5.0, Shipping = 2.0 }, // more expensive sku within same seller
                    new Product() { Sku = "Beta", Price = 4.0, Shipping = 2.0 },
                    new Product() { Sku = "Gamma", Price = 8.0, Shipping = 2.0 }
                }
            },
            new Seller()
            {
                Id = 2,
                Products = new List<Product>()
                {
                    new Product() { Sku = "Alpha", Price = 5.0, Shipping = 1.0 },
                    new Product() { Sku = "Beta", Price = 5.0, Shipping = 1.0 },
                    new Product() { Sku = "Gamma", Price = 8.0, Shipping = 2.0 }
                }
            }
        };

        // Eliminate duplicate Products amongst all sellers that have matching "Sku".
        // Rules:
        // Keep the Product with the lowest price.
        // If price is equal, keep the product with lower shipping.
        // If shipping is also equal, then keep the product with lowest seller Id.
        // If at the end of all comparisons, a seller ends up with no products, then remove that seller.

        // In this example, I expect to have (not necessarily in this order):
        // 1.{Beta, 4.0, 2.0} // Seller 1's Beta has a lower price than seller 2's Beta
        // 1.{Gamma, 8.0, 2.0} // Seller 1's Gamma is an identical deal to seller 2's, but seller 1 has the lower Id
        // 2.{Alpha, 5.0, 1.0} // Seller 2's Alpha has a lower shipping cost than seller 1's Alpha

        var newSellers = new List<Seller>();

        foreach (var seller in sellers)
        {
            foreach (var product in seller.Products)
            {
                // TODO: Possible performance improvement? Check for existing seller & product in newSellers before calling any code below.
                var bestSeller = seller;
                var bestProduct = product;
                FindBestSellerAndProduct(sellers, ref bestSeller, ref bestProduct);
                AddIfNotExists(newSellers, bestSeller, bestProduct);
            }
        }

        newSellers.Sort((x, y) => x.Id.CompareTo(y.Id)); // Ensures the list is sorted by seller id... do I really care?
    }

    // Scans every seller for products with the same Sku and replaces the
    // ref arguments whenever a better offer (per the rules above) is found.
    private static void FindBestSellerAndProduct(IList<Seller> sellers, ref Seller seller, ref Product product)
    {
        string sku = product.Sku;

        foreach (var s in sellers)
        {
            foreach (var p in s.Products.Where(x => x.Sku == sku))
            {
                if ((product.Price > p.Price) ||
                    (product.Price == p.Price && product.Shipping > p.Shipping) ||
                    (product.Price == p.Price && product.Shipping == p.Shipping && seller.Id > s.Id))
                {
                    seller = s;
                    product = p;
                }
            }
        }
    }

    private static void AddIfNotExists(IList<Seller> sellers, Seller seller, Product product)
    {
        var newSeller = sellers.SingleOrDefault(x => x.Id == seller.Id);
        if (newSeller == null)
        {
            // Add input seller and product if seller doesn't already exist in our list.
            newSeller = new Seller() { Id = seller.Id, Products = new List<Product>() };
            newSeller.Products.Add(product);
            sellers.Add(newSeller);
        }
        else
        {
            var newProduct = newSeller.Products.Find(x => x.Sku == product.Sku);
            if (newProduct == null)
            {
                // Add input product if it doesn't already exist in our list
                newSeller.Products.Add(product);
            }
        }
    }
}

// I cannot modify the below classes.
public sealed class Seller
{
    public int Id;
    public List<Product> Products;
}

public sealed class Product
{
    public string Sku;
    public double Price;
    public double Shipping;
}

Source

2012-10-25 Lee Grissom

This query does the job:

var query = sellers
    .SelectMany(s => s.Products.Select(p => new { SellerId = s.Id, Product = p })) // 1
    .OrderBy(x => x.Product.Price) // 2
    .ThenBy(x => x.Product.Shipping)
    .ThenBy(x => x.SellerId)
    .GroupBy(x => x.Product.Sku) // 3
    .Select(g => g.First()) // 4
    .GroupBy(x => x.SellerId) // 5
    .Select(g => new Seller()
    {
        Id = g.Key,
        Products = g.Select(x => x.Product).ToList()
    })
    .ToList();

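How it works:

1. Flatten the sequence into a list of anonymous objects { SellerId, Product }.
2. Order the sequence by your rules: price, then shipping, then SellerId.
3. Group the ordered sequence by product Sku. This produces groups of { SellerId, Product } items in which every product has the same Sku but may belong to different sellers.
4. Select the first item from each group. Because the sequence was ordered by your rules before grouping, the first item in each group is the best offer for that Sku.
5. Rebuild the hierarchy: group by SellerId and create a Seller object holding all of that seller's best offers. A seller with no best offer gets no group, so that seller is dropped from the result.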
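To make steps 1 to 4 concrete, here is a minimal sketch that stops before the hierarchy is rebuilt and returns the winning offer for each Sku, using the sample data from the question. The names `StepDemo`, `BestOffers` and `SampleSellers` are hypothetical, and the `Seller`/`Product` classes are repeated only so the snippet compiles on its own. It relies on LINQ to Objects keeping elements inside each group in source order, which is what lets `First()` pick the best offer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Seller { public int Id; public List<Product> Products; }
public sealed class Product { public string Sku; public double Price; public double Shipping; }

public static class StepDemo // hypothetical name, for illustration only
{
    // Steps 1-4: flatten, order by the three rules, group by Sku, take the first of each group.
    // Each result pairs the winning seller's Id (Key) with the winning product (Value).
    public static List<KeyValuePair<int, Product>> BestOffers(IEnumerable<Seller> sellers)
    {
        return sellers
            .SelectMany(s => s.Products.Select(p => new { SellerId = s.Id, Product = p }))
            .OrderBy(x => x.Product.Price)
            .ThenBy(x => x.Product.Shipping)
            .ThenBy(x => x.SellerId)
            .GroupBy(x => x.Product.Sku)
            .Select(g => g.First())
            .Select(x => new KeyValuePair<int, Product>(x.SellerId, x.Product))
            .ToList();
    }

    // Same data as the question's sample application.
    public static List<Seller> SampleSellers()
    {
        return new List<Seller>
        {
            new Seller
            {
                Id = 1,
                Products = new List<Product>
                {
                    new Product { Sku = "Alpha", Price = 5.0, Shipping = 2.0 },
                    new Product { Sku = "Beta", Price = 5.0, Shipping = 2.0 },
                    new Product { Sku = "Beta", Price = 4.0, Shipping = 2.0 },
                    new Product { Sku = "Gamma", Price = 8.0, Shipping = 2.0 }
                }
            },
            new Seller
            {
                Id = 2,
                Products = new List<Product>
                {
                    new Product { Sku = "Alpha", Price = 5.0, Shipping = 1.0 },
                    new Product { Sku = "Beta", Price = 5.0, Shipping = 1.0 },
                    new Product { Sku = "Gamma", Price = 8.0, Shipping = 2.0 }
                }
            }
        };
    }

    public static void Main()
    {
        foreach (var offer in BestOffers(SampleSellers()))
        {
            Console.WriteLine("{0}: seller {1}, price {2}, shipping {3}",
                offer.Value.Sku, offer.Key, offer.Value.Price, offer.Value.Shipping);
        }
    }
}
```

For the sample data this yields one offer per Sku: Beta from seller 1, Alpha from seller 2 and Gamma from seller 1, which matches the result the question expects.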

Source

2012-10-25 23:29:57

Very clever, and it works! Thank you. It is one third the size of my version, but it runs 7 times slower, so I'll need to do some testing to see how it performs in my real project. Thanks again.
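If the sort turns out to matter in practice, one possible alternative is a single pass that keeps the best offer per Sku in a dictionary, so the flattened sequence never has to be ordered. This is only a sketch and has not been measured against either version above. `DedupSketch`, `KeepBestOffers` and `IsBetter` are hypothetical names, and the `Seller`/`Product` classes are repeated only to make the snippet self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Seller { public int Id; public List<Product> Products; }
public sealed class Product { public string Sku; public double Price; public double Shipping; }

public static class DedupSketch // hypothetical name, for illustration only
{
    // True when the candidate beats the current best under the question's rules:
    // lower price, then lower shipping, then lower seller Id.
    private static bool IsBetter(int candidateId, Product candidate, int currentId, Product current)
    {
        if (candidate.Price != current.Price) return candidate.Price < current.Price;
        if (candidate.Shipping != current.Shipping) return candidate.Shipping < current.Shipping;
        return candidateId < currentId;
    }

    public static List<Seller> KeepBestOffers(IEnumerable<Seller> sellers)
    {
        // Sku -> (seller Id, product) of the best offer seen so far.
        var best = new Dictionary<string, KeyValuePair<int, Product>>();

        foreach (var s in sellers)
        {
            foreach (var p in s.Products)
            {
                KeyValuePair<int, Product> current;
                if (!best.TryGetValue(p.Sku, out current) ||
                    IsBetter(s.Id, p, current.Key, current.Value))
                {
                    best[p.Sku] = new KeyValuePair<int, Product>(s.Id, p);
                }
            }
        }

        // Rebuild the hierarchy; sellers without a winning offer never appear here.
        return best.Values
            .GroupBy(e => e.Key, e => e.Value)
            .OrderBy(g => g.Key)
            .Select(g => new Seller { Id = g.Key, Products = g.ToList() })
            .ToList();
    }
}
```

Calling `DedupSketch.KeepBestOffers(sellers)` on the question's `sellers` list returns new `Seller` objects sorted by Id and leaves the input list untouched. Whether it is actually faster depends on the data, so it should be timed alongside the other two versions.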
