Time Complexity of dplyr functions

  • Thread starter Trollfaz
In summary, basic dplyr verbs such as filter() and mutate() scale roughly as O(N) in the number of rows, while joins of two data frames scale roughly as O(N+M) plus the size of the output.
  • #1
Trollfaz
TL;DR Summary
R
I have a question about the time complexity of functions in R's dplyr package (and a few from tidyr).
Suppose I have a data frame/tibble with N observations (N rows). Let's call it df1. Is the time complexity of the following basic manipulation functions O(N)?
filter()
select()
mutate(), assuming the computation for each row is O(1)
rename()
summarize()
count()
separate()
unite()
spread()
gather()
If I have another data frame/tibble df2 with M rows, do the following functions have time complexity O(N+M)?
inner_join(df1, df2)
left_join(df1, df2) and right_join(df1, df2)
full_join(df1, df2) (the outer join; dplyr has no outer_join())
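
One rough way to check the linear-scaling guess empirically, rather than from documentation, is to time a single verb at increasing row counts. The sketch below assumes dplyr is installed. The helper time_filter() and the chosen sizes are made up for illustration, and timings on a real machine are noisy, so the result is only indicative.

```r
library(dplyr)

# Hypothetical helper: median elapsed seconds for filter() on a tibble of n rows.
time_filter <- function(n, reps = 5) {
  df <- tibble(id = seq_len(n), x = runif(n))
  times <- vapply(seq_len(reps), function(i) {
    system.time(filter(df, x > 0.5))[["elapsed"]]
  }, numeric(1))
  median(times)
}

# Illustrative sizes only; adjust to the memory available.
sizes <- c(1e5, 1e6, 4e6)
timings <- vapply(sizes, time_filter, numeric(1))

# If filter() is O(N), seconds per row should stay roughly constant as n grows.
data.frame(n = sizes, seconds = timings, sec_per_row = timings / sizes)
```

The same pattern should work for the other verbs by swapping the call inside system.time().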
 
  • #2
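Broadly yes, with some caveats. dplyr does not document formal complexity guarantees, so these are practical estimates rather than promises.

filter() and mutate() (with an O(1) computation per row) do work proportional to the number of rows, so they are O(N). summarize() and count() are also roughly O(N), provided the summary function itself is linear, since grouping is typically done by hashing or ordering the key columns. select() and rename() act on column names and generally do not need to touch every row, so their cost depends mainly on the number of columns rather than on N. separate(), unite(), spread(), and gather() belong to tidyr rather than dplyr; their cost is roughly linear in the number of cells they process (rows times affected columns).

For joins, note that dplyr has no outer_join(); the outer join is full_join(). inner_join(), left_join(), right_join(), and full_join() are roughly O(N+M) for matching the keys, plus the size of the output. If a key is duplicated in both tables, every matching pair of rows appears in the result, so the output can reach N*M rows and the worst case is O(N*M).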
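
To make the point about output size concrete, here is a minimal sketch, assuming dplyr is installed. The table names and the sizes n and m are arbitrary choices for illustration. It compares row counts for joins on unique keys with a join where every row shares the same key.

```r
library(dplyr)

n <- 300
m <- 200

# Case 1: unique keys, so each row of df1 matches at most one row of df2.
df1_unique <- tibble(key = seq_len(n), a = runif(n))
df2_unique <- tibble(key = seq_len(m), b = runif(m))
rows_unique <- nrow(inner_join(df1_unique, df2_unique, by = "key"))

# Case 2: every row has the same key, so every pair of rows matches.
# Recent dplyr versions warn about many-to-many matches; the warning is
# suppressed here because the blow-up is exactly what we want to show.
df1_dup <- tibble(key = rep(1L, n), a = runif(n))
df2_dup <- tibble(key = rep(1L, m), b = runif(m))
rows_dup <- suppressWarnings(nrow(inner_join(df1_dup, df2_dup, by = "key")))

# full_join() is dplyr's outer join: it keeps unmatched rows from both sides.
rows_full <- nrow(full_join(df1_unique, df2_unique, by = "key"))

rows_unique  # 200, i.e. min(n, m) for these keys
rows_dup     # 60000, i.e. n * m
rows_full    # 300, the number of distinct keys across both tables
```

With unique keys the output stays within N+M rows, so the join as a whole stays close to O(N+M). With heavily duplicated keys, writing the output alone costs O(N*M).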
 
